Search Shortcut cmd + k | ctrl + k
lpts

Optimized-plan inspection and cross-system SQL transpilation

Maintainer(s): ila

Installing and Loading

INSTALL lpts FROM community;
LOAD lpts;

Example

-- Set both input and output dialects to DuckDB for a first round trip.
D SET lpts_input_dialect = 'duckdb';
D SET lpts_dialect = 'duckdb';

D CREATE TABLE users (id INTEGER, name VARCHAR, age INTEGER);
D INSERT INTO users VALUES (1, 'Alice', 30), (2, 'Bob', 22), (3, 'Carol', 28);

-- Return the optimized plan as readable CTE SQL.
D PRAGMA lpts('SELECT name FROM users WHERE age > 25');
WITH
t0_scan (t0_name) AS (
    SELECT  "name"
    FROM    memory.main.users
    WHERE   (age>25)
)
SELECT  t0_name AS "name"
FROM    t0_scan;

-- Check that the generated SQL returns the same bag of rows as the input query.
D PRAGMA lpts_check('SELECT name FROM users WHERE age > 25');
┌─────────┐
  match  
 boolean 
├─────────┤
 true    
└─────────┘

About lpts

LPTS is a DuckDB extension for optimized-plan inspection and cross-system SQL transpilation. LPTS takes DuckDB's post-optimizer logical plan and reconstructs equivalent SQL as a sequence of named CTEs.

PRAGMA Syntax

PRAGMA lpts('<query>');

EXPLAIN (FORMAT SQL)

LPTS also extends DuckDB's EXPLAIN with a SQL format. `EXPLAIN (FORMAT SQL)

` returns the optimized logical plan rendered as equivalent CTE SQL — the same output as `PRAGMA lpts`, but exposed as a first-class `EXPLAIN` statement (similar to the plan-as-SQL output in Umbra). The CLI prints the SQL as plain multi-line text, while JDBC, Python, and other clients receive the standard two-column (`explain_key`, `explain_value`) EXPLAIN result. It honors `lpts_dialect` just like `PRAGMA lpts`. ```sql EXPLAIN (FORMAT SQL) SELECT name FROM users WHERE age > 25; ``` ```text ┌─────────────────────────────┐ │┌───────────────────────────┐│ ││ Optimized Logical Plan ││ │└───────────────────────────┘│ └─────────────────────────────┘ WITH t0_scan (t0_name) AS ( SELECT "name" FROM memory.main.users WHERE (age>25) ) SELECT t0_name AS "name" FROM t0_scan; ``` LPTS plans the query through DuckDB, optimizes it, then serializes the optimized logical plan. ## Supported Dialects The dialect settings accept these values: | Dialect | Accepted values | |---|---| | DuckDB | `duckdb` | | PostgreSQL | `postgres`, `postgresql` | | Spark SQL | `spark` | | Hive | `hive` | | Trino / Presto | `trino`, `presto` | | Snowflake | `snowflake` | | BigQuery | `bigquery`, `bq` | | Redshift | `redshift` | | MySQL / MariaDB | `mysql`, `mariadb` | ## Use Cases - Inspect optimized DuckDB plans as SQL. - Debug optimizer rewrites such as filter pushdown, join reordering, top-N, materialized CTEs, and subquery decorrelation. - Generate a CTE program that communicates the optimized execution shape. - Emit SQL for another engine with `lpts_dialect`. - Convert other SQL dialect syntax to DuckDB SQL with `lpts_input_dialect`, then execute or inspect it. ## Supported Operators LPTS is intended to cover all logical operators produced by optimized DuckDB SELECT plans. The current regression suite round-trips all 22 TPC-H queries and exercises joins, aggregates, windows, set operations, CTEs, recursive CTEs, table functions, DuckLake scans, and inserts. Unsupported optimizer edge cases fail explicitly with `NotImplementedException`. ## Examples ```sql D CREATE TABLE events (id INTEGER, ts TIMESTAMP, name VARCHAR, "order" INTEGER); D INSERT INTO events VALUES (1, TIMESTAMP '2024-01-15 08:09:10', 'alpha', 10), (11, TIMESTAMP '2024-01-16 11:12:13', 'beta', 20); -- Render generated SQL for PostgreSQL. D SET lpts_dialect = 'postgres'; -- Return generated CTE SQL directly in the shell. D PRAGMA lpts( 'SELECT strftime(ts, ''%Y-%m-%d'') AS day FROM events WHERE id > 10 ORDER BY day' ); -- Return generated CTE SQL as a table row, useful in scripts and tests. D SELECT sql FROM lpts_query( 'SELECT strftime(ts, ''%Y-%m-%d'') AS day FROM events WHERE id > 10 ORDER BY day' ); ``` ```text WITH t0_scan (ts) AS ( SELECT ts FROM events WHERE (id>10) ) SELECT to_char(ts, 'YYYY-MM-DD') AS "day" FROM t0_scan ORDER BY to_char(ts, 'YYYY-MM-DD') ASC NULLS LAST; ``` ```sql -- Switch back to DuckDB rendering. D SET lpts_dialect = 'duckdb'; -- Execute the generated SQL and return the query result. D PRAGMA lpts_exec('SELECT name FROM events WHERE id > 10 ORDER BY name'); -- Compare original and generated SQL using bag equality. D PRAGMA lpts_check('SELECT name FROM events WHERE id > 10 ORDER BY name'); ``` ```text ┌─────────┐ │ name │ │ varchar │ ├─────────┤ │ beta │ └─────────┘ ┌─────────┐ │ match │ │ boolean │ ├─────────┤ │ true │ └─────────┘ ``` ```sql -- Print the AST tree to stdout for interactive debugging. D PRAGMA print_ast('SELECT name FROM events WHERE id > 10 ORDER BY name'); -- Return the AST tree as a table row, useful for tools and regression tests. D SELECT ast FROM print_ast_query('SELECT name FROM events WHERE id > 10 ORDER BY name'); ``` ```sql -- Normalize MySQL syntax before DuckDB parses and plans the query. D SET lpts_input_dialect = 'mysql'; -- Return source-dialect SQL normalized to DuckDB SQL. D SELECT sql FROM lpts_normalize_query( 'SELECT `order`, DATE_FORMAT(ts, ''%Y-%m-%d %H:%i:%s'') AS formatted FROM events LIMIT 5, 10' ); ``` ```text SELECT "order", strftime(ts, '%Y-%m-%d %H:%M:%S') AS formatted FROM events LIMIT 10 OFFSET 5 ``` ### Added Functions
| function_name | function_type | description | comment | examples | |----------------------|---------------|------------------------------------------------------------------------------------------------|---------|------------------------------------------------------------------------------------------------------------------| | lpts | pragma | Return the optimized logical plan for a query as equivalent CTE SQL. | NULL | [PRAGMA lpts('SELECT name FROM users WHERE age > 25')] | | lpts_check | pragma | Compare the original query and the LPTS-generated query using EXCEPT ALL in both directions. | NULL | [PRAGMA lpts_check('SELECT name FROM users WHERE age > 25')] | | lpts_exec | pragma | Execute the SQL generated by LPTS and return its result rows. | NULL | [PRAGMA lpts_exec('SELECT name FROM users WHERE age > 25')] | | lpts_normalize_query | table | Normalize SQL from lpts_input_dialect into DuckDB SQL without planning it. | NULL | [SET lpts_input_dialect = 'mysql'; SELECT sql FROM lpts_normalize_query('SELECT `order` FROM users LIMIT 1, 2')] | | lpts_query | table | Table-function form of PRAGMA lpts. Returns the generated CTE SQL as a single sql column. | NULL | [SELECT sql FROM lpts_query('SELECT name FROM users WHERE age > 25')] | | print_ast | pragma | Print the LPTS AST tree for a query to stdout. | NULL | [PRAGMA print_ast('SELECT name FROM users WHERE age > 25')] | | print_ast_query | table | Table-function form of PRAGMA print_ast. Returns the rendered AST tree as a single ast column. | NULL | [SELECT ast FROM print_ast_query('SELECT name FROM users WHERE age > 25')] | ### Overloaded Functions
This extension does not add any function overloads. ### Added Types
This extension does not add any types. ### Added Settings
| name | description | input_type | scope | aliases | |---------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------------|--------|---------| | lpts_dialect | SQL dialect for lpts output. Valid values: 'duckdb' (default), 'postgres', 'spark', 'hive', 'trino', 'presto', 'snowflake', 'bigquery', 'redshift', 'mysql', 'mariadb' | VARCHAR | GLOBAL | [] | | lpts_enable_data_dependent_optimizers | Enable LPTS planning optimizers that depend on current data, statistics, cardinality estimates, row groups, or runtime dynamic filters. | BOOLEAN | GLOBAL | [] | | lpts_input_dialect | SQL dialect for lpts input normalization. Valid values: 'duckdb' (default), 'postgres', 'spark', 'hive', 'trino', 'presto', 'snowflake', 'bigquery', 'redshift', 'mysql', 'mariadb' | VARCHAR | GLOBAL | [] | | lpts_merge_pipeline | Fuse chains of single-child pipeline operators (Limit/OrderBy/Project/Aggregate/Filter, and pushdown-free base-table scans) into one flat SELECT per query block instead of one CTE per operator. | BOOLEAN | GLOBAL | [] |