tpch_rust

Generate TPC-H data as DuckDB tables or Parquet files, powered by the pure-Rust tpchgen-rs generators

Maintainer(s): guillesd

Installing and Loading

INSTALL tpch_rust FROM community;
LOAD tpch_rust;

Example

-- Stream a single TPC-H table as a DuckDB table function and materialize it:
CREATE TABLE lineitem AS FROM tpch(1, 'lineitem');

-- Or query generated rows directly without materializing:
SELECT count(*) FROM tpch(0.1, 'orders');

-- Get a ready-to-run load script for the full 8-table schema:
SELECT statement FROM tpch_load(1);

-- Write Parquet files to disk instead (optionally a subset of tables):
CALL tpch_gen(1, 'data', tables := ['lineitem', 'orders']);

About tpch_rust


A Rust DuckDB extension (built on the C-extension template) that generates TPC-H data using the pure-Rust tpchgen-rs generators. It offers two modes:

  • DuckDB tables: tpch(sf, 'table') is a table function that streams the rows of a single TPC-H table, so you can materialize it with CREATE TABLE ... AS FROM tpch(...). tpch_load(sf) returns a ready-to-run load script for the full 8-table schema.
  • Parquet files: tpch_gen(sf, output_dir, [tables]) writes Parquet files directly to disk, with knobs for compression, row_group_bytes, and parts (output-file count).
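
As a rough sketch of the Parquet mode, the call below sets the three knobs named above. Only the parameter names come from the description; the values ('zstd', a 64 MiB row-group size, 4 parts) and the assumption that files land under the output directory with a .parquet suffix are illustrative guesses, not documented behavior, so check the project README before relying on them.

```sql
-- Hypothetical values: only the parameter names are taken from the description above.
CALL tpch_gen(
    0.1,
    'data',
    tables := ['orders'],
    compression := 'zstd',        -- assumed codec name
    row_group_bytes := 67108864,  -- assumed unit and value: 64 MiB per row group
    parts := 4                    -- assumed: split the output into 4 files
);

-- Read the result back. The glob assumes the files are written
-- somewhere under data/ with a .parquet suffix.
SELECT count(*) FROM read_parquet('data/**/*.parquet');
```

Restricting tables to a single entry keeps the glob unambiguous, since every matched file then belongs to the same table.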

Supported tables: nation, region, part, supplier, partsupp, customer, orders, lineitem. Generation is parallelized across CPU cores and streamed in bounded batches, so memory stays flat regardless of scale factor.
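
A minimal sketch of materializing all eight supported tables by hand, using only the tpch(sf, 'table') form shown in the Example section. The scale factor 0.01 is an arbitrary choice for a quick run, and whether this matches the script returned by tpch_load(sf) is not confirmed here.

```sql
-- One CREATE TABLE per supported table, all at the same scale factor.
CREATE TABLE nation   AS FROM tpch(0.01, 'nation');
CREATE TABLE region   AS FROM tpch(0.01, 'region');
CREATE TABLE part     AS FROM tpch(0.01, 'part');
CREATE TABLE supplier AS FROM tpch(0.01, 'supplier');
CREATE TABLE partsupp AS FROM tpch(0.01, 'partsupp');
CREATE TABLE customer AS FROM tpch(0.01, 'customer');
CREATE TABLE orders   AS FROM tpch(0.01, 'orders');
CREATE TABLE lineitem AS FROM tpch(0.01, 'lineitem');

-- Quick sanity check: row counts for the smallest and largest tables.
SELECT 'nation' AS table_name, count(*) AS row_count FROM nation
UNION ALL
SELECT 'region', count(*) FROM region
UNION ALL
SELECT 'lineitem', count(*) FROM lineitem;
```

Using the same scale factor for every table keeps the generated data set consistent, so that the tables can be joined as the TPC-H schema intends.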

See the project README for benchmarks and the full set of parameters.

Added Functions

function_name | function_type | description | comment | examples
tpch          | table         | NULL        | NULL    |
tpch_gen      | table         | NULL        | NULL    |
tpch_load     | table         | NULL        | NULL    |

Overloaded Functions

This extension does not add any function overloads.

Added Types

This extension does not add any types.

Added Settings

This extension does not add any settings.