Generate TPC-H data as DuckDB tables or Parquet files, powered by the pure-Rust tpchgen-rs generators
Maintainer(s):
guillesd
Installing and Loading
INSTALL tpch_rust FROM community;
LOAD tpch_rust;
Example
-- Stream a single TPC-H table as a DuckDB table function and materialize it:
CREATE TABLE lineitem AS FROM tpch(1, 'lineitem');
-- Or query generated rows directly without materializing:
SELECT count(*) FROM tpch(0.1, 'orders');
-- Get a ready-to-run load script for the full 8-table schema:
SELECT statement FROM tpch_load(1);
-- Write Parquet files to disk instead (optionally a subset of tables):
CALL tpch_gen(1, 'data', tables := ['lineitem', 'orders']);
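To inspect what the tpch_gen call above wrote, a sketch along these lines may help. It relies only on DuckDB's built-in glob and parquet_file_metadata functions. The recursive pattern is an assumption, because the exact file layout under the output directory is not described here.

```sql
-- List whatever Parquet files ended up under 'data'.
-- Assumption: files live somewhere below 'data'; the recursive glob
-- avoids guessing the exact naming scheme.
SELECT file
FROM glob('data/**/*.parquet')
ORDER BY file;

-- Read row counts and row-group counts from each file's Parquet footer.
SELECT file_name, num_rows, num_row_groups
FROM parquet_file_metadata('data/**/*.parquet')
ORDER BY file_name;
```

Reading only the footers keeps this check cheap even at large scale factors, since no row data is scanned.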
About tpch_rust
A Rust DuckDB extension (built on the C-extension template) that generates
TPC-H data using the pure-Rust
tpchgen-rs generators. It offers two modes:
- DuckDB tables: tpch(sf, 'table') is a table function that streams the rows of a single TPC-H table, so you can materialize it with CREATE TABLE ... AS FROM tpch(...). tpch_load(sf) returns a ready-to-run load script for the full 8-table schema.
- Parquet files: tpch_gen(sf, output_dir, [tables]) writes Parquet files directly to disk, with knobs for compression, row_group_bytes, and parts (output-file count).
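As a rough illustration of the Parquet knobs, the call below passes all three as named parameters, following the tables := [...] style used in the Example section. The parameter names come from the description above. The value types and accepted values shown ('zstd', a byte count, an integer) are assumptions, so check the project README before relying on them.

```sql
-- Sketch only: parameter names are from the extension description;
-- the values below are ASSUMED to be valid and may need adjusting.
CALL tpch_gen(
    1, 'data',
    tables := ['lineitem'],
    compression := 'zstd',          -- assumed: codec given as a string
    row_group_bytes := 134217728,   -- assumed: target row-group size in bytes (128 MiB)
    parts := 4                      -- output-file count, per the description above
);
```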
Supported tables: nation, region, part, supplier, partsupp, customer, orders,
lineitem. Generation is parallelized across CPU cores and streamed in bounded batches, so
memory stays flat regardless of scale factor.
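For example, all eight supported tables can be materialized one by one with the tpch table function. This is a minimal sketch using a small scale factor. The final query is a sanity check based on the TPC-H specification, which fixes nation at 25 rows and region at 5 rows regardless of scale factor. It assumes the generator follows the specification in this respect.

```sql
-- Materialize each supported table at scale factor 0.1.
CREATE TABLE nation   AS FROM tpch(0.1, 'nation');
CREATE TABLE region   AS FROM tpch(0.1, 'region');
CREATE TABLE part     AS FROM tpch(0.1, 'part');
CREATE TABLE supplier AS FROM tpch(0.1, 'supplier');
CREATE TABLE partsupp AS FROM tpch(0.1, 'partsupp');
CREATE TABLE customer AS FROM tpch(0.1, 'customer');
CREATE TABLE orders   AS FROM tpch(0.1, 'orders');
CREATE TABLE lineitem AS FROM tpch(0.1, 'lineitem');

-- Sanity check: the two fixed-size tables should not depend on the scale factor.
SELECT
    (SELECT count(*) FROM nation) AS nation_rows,
    (SELECT count(*) FROM region) AS region_rows;
```

The tpch_load(sf) function described above returns an equivalent script, so writing the statements by hand is mainly useful when only some tables are needed or when each table should use a different scale factor.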
See the project README for benchmarks and the full set of parameters.
Added Functions
| function_name | function_type | description | comment | examples |
|---|---|---|---|---|
| tpch | table | NULL | NULL | |
| tpch_gen | table | NULL | NULL | |
| tpch_load | table | NULL | NULL | |
Overloaded Functions
This extension does not add any function overloads.
Added Types
This extension does not add any types.
Added Settings
This extension does not add any settings.