-- Set both input and output dialects to DuckDB for a first round trip.DSETlpts_input_dialect='duckdb';DSETlpts_dialect='duckdb';DCREATETABLEusers(idINTEGER,nameVARCHAR,ageINTEGER);DINSERTINTOusersVALUES(1,'Alice',30),(2,'Bob',22),(3,'Carol',28);-- Return the optimized plan as readable CTE SQL.DPRAGMAlpts('SELECT name FROM users WHERE age > 25');WITHt0_scan(t0_name)AS(SELECT"name"FROMmemory.main.usersWHERE(age>25))SELECTt0_nameAS"name"FROMt0_scan;-- Check that the generated SQL returns the same bag of rows as the input query.DPRAGMAlpts_check('SELECT name FROM users WHERE age > 25');┌─────────┐│match││boolean│├─────────┤│true│└─────────┘
LPTS is a DuckDB extension for optimized-plan inspection and
cross-system SQL transpilation. LPTS takes DuckDB's post-optimizer
logical plan and reconstructs equivalent SQL as a sequence of named CTEs.
LPTS also extends DuckDB's EXPLAIN with a SQL format. `EXPLAIN (FORMAT SQL)
` returns the optimized logical plan rendered as equivalent CTE SQL — the
same output as `PRAGMA lpts`, but exposed as a first-class `EXPLAIN` statement
(similar to the plan-as-SQL output in Umbra). The CLI prints the SQL as plain
multi-line text, while JDBC, Python, and other clients receive the standard
two-column (`explain_key`, `explain_value`) EXPLAIN result. It honors
`lpts_dialect` just like `PRAGMA lpts`.
```sql
EXPLAIN (FORMAT SQL) SELECT name FROM users WHERE age > 25;
```
```text
┌─────────────────────────────┐
│┌───────────────────────────┐│
││ Optimized Logical Plan ││
│└───────────────────────────┘│
└─────────────────────────────┘
WITH
t0_scan (t0_name) AS (
SELECT "name"
FROM memory.main.users
WHERE (age>25)
)
SELECT t0_name AS "name"
FROM t0_scan;
```
LPTS plans the query through DuckDB, optimizes it, then serializes the
optimized logical plan.
## Supported Dialects
The dialect settings accept these values:
| Dialect | Accepted values |
|---|---|
| DuckDB | `duckdb` |
| PostgreSQL | `postgres`, `postgresql` |
| Spark SQL | `spark` |
| Hive | `hive` |
| Trino / Presto | `trino`, `presto` |
| Snowflake | `snowflake` |
| BigQuery | `bigquery`, `bq` |
| Redshift | `redshift` |
| MySQL / MariaDB | `mysql`, `mariadb` |
## Use Cases
- Inspect optimized DuckDB plans as SQL.
- Debug optimizer rewrites such as filter pushdown, join reordering, top-N,
materialized CTEs, and subquery decorrelation.
- Generate a CTE program that communicates the optimized execution shape.
- Emit SQL for another engine with `lpts_dialect`.
- Convert other SQL dialect syntax to DuckDB SQL with `lpts_input_dialect`,
then execute or inspect it.
## Supported Operators
LPTS is intended to cover all logical operators produced by optimized DuckDB
SELECT plans. The current regression suite round-trips all 22 TPC-H queries
and exercises joins, aggregates, windows, set operations, CTEs, recursive
CTEs, table functions, DuckLake scans, and inserts.
Unsupported optimizer edge cases fail explicitly with `NotImplementedException`.
## Examples
```sql
D CREATE TABLE events (id INTEGER, ts TIMESTAMP, name VARCHAR, "order" INTEGER);
D INSERT INTO events VALUES
(1, TIMESTAMP '2024-01-15 08:09:10', 'alpha', 10),
(11, TIMESTAMP '2024-01-16 11:12:13', 'beta', 20);
-- Render generated SQL for PostgreSQL.
D SET lpts_dialect = 'postgres';
-- Return generated CTE SQL directly in the shell.
D PRAGMA lpts(
'SELECT strftime(ts, ''%Y-%m-%d'') AS day
FROM events
WHERE id > 10
ORDER BY day'
);
-- Return generated CTE SQL as a table row, useful in scripts and tests.
D SELECT sql
FROM lpts_query(
'SELECT strftime(ts, ''%Y-%m-%d'') AS day
FROM events
WHERE id > 10
ORDER BY day'
);
```
```text
WITH
t0_scan (ts) AS (
SELECT ts
FROM events
WHERE (id>10)
)
SELECT to_char(ts, 'YYYY-MM-DD') AS "day"
FROM t0_scan
ORDER BY to_char(ts, 'YYYY-MM-DD') ASC NULLS LAST;
```
```sql
-- Switch back to DuckDB rendering.
D SET lpts_dialect = 'duckdb';
-- Execute the generated SQL and return the query result.
D PRAGMA lpts_exec('SELECT name FROM events WHERE id > 10 ORDER BY name');
-- Compare original and generated SQL using bag equality.
D PRAGMA lpts_check('SELECT name FROM events WHERE id > 10 ORDER BY name');
```
```text
┌─────────┐
│ name │
│ varchar │
├─────────┤
│ beta │
└─────────┘
┌─────────┐
│ match │
│ boolean │
├─────────┤
│ true │
└─────────┘
```
```sql
-- Print the AST tree to stdout for interactive debugging.
D PRAGMA print_ast('SELECT name FROM events WHERE id > 10 ORDER BY name');
-- Return the AST tree as a table row, useful for tools and regression tests.
D SELECT ast
FROM print_ast_query('SELECT name FROM events WHERE id > 10 ORDER BY name');
```
```sql
-- Normalize MySQL syntax before DuckDB parses and plans the query.
D SET lpts_input_dialect = 'mysql';
-- Return source-dialect SQL normalized to DuckDB SQL.
D SELECT sql
FROM lpts_normalize_query(
'SELECT `order`, DATE_FORMAT(ts, ''%Y-%m-%d %H:%i:%s'') AS formatted FROM events LIMIT 5, 10'
);
```
```text
SELECT "order", strftime(ts, '%Y-%m-%d %H:%M:%S') AS formatted FROM events LIMIT 10 OFFSET 5
```
### Added Functions
| function_name | function_type | description | comment | examples |
|----------------------|---------------|------------------------------------------------------------------------------------------------|---------|------------------------------------------------------------------------------------------------------------------|
| lpts | pragma | Return the optimized logical plan for a query as equivalent CTE SQL. | NULL | [PRAGMA lpts('SELECT name FROM users WHERE age > 25')] |
| lpts_check | pragma | Compare the original query and the LPTS-generated query using EXCEPT ALL in both directions. | NULL | [PRAGMA lpts_check('SELECT name FROM users WHERE age > 25')] |
| lpts_exec | pragma | Execute the SQL generated by LPTS and return its result rows. | NULL | [PRAGMA lpts_exec('SELECT name FROM users WHERE age > 25')] |
| lpts_normalize_query | table | Normalize SQL from lpts_input_dialect into DuckDB SQL without planning it. | NULL | [SET lpts_input_dialect = 'mysql'; SELECT sql FROM lpts_normalize_query('SELECT `order` FROM users LIMIT 1, 2')] |
| lpts_query | table | Table-function form of PRAGMA lpts. Returns the generated CTE SQL as a single sql column. | NULL | [SELECT sql FROM lpts_query('SELECT name FROM users WHERE age > 25')] |
| print_ast | pragma | Print the LPTS AST tree for a query to stdout. | NULL | [PRAGMA print_ast('SELECT name FROM users WHERE age > 25')] |
| print_ast_query | table | Table-function form of PRAGMA print_ast. Returns the rendered AST tree as a single ast column. | NULL | [SELECT ast FROM print_ast_query('SELECT name FROM users WHERE age > 25')] |
### Overloaded Functions
This extension does not add any function overloads.
### Added Types
This extension does not add any types.
### Added Settings
| name | description | input_type | scope | aliases |
|---------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------------|--------|---------|
| lpts_dialect | SQL dialect for lpts output. Valid values: 'duckdb' (default), 'postgres', 'spark', 'hive', 'trino', 'presto', 'snowflake', 'bigquery', 'redshift', 'mysql', 'mariadb' | VARCHAR | GLOBAL | [] |
| lpts_enable_data_dependent_optimizers | Enable LPTS planning optimizers that depend on current data, statistics, cardinality estimates, row groups, or runtime dynamic filters. | BOOLEAN | GLOBAL | [] |
| lpts_input_dialect | SQL dialect for lpts input normalization. Valid values: 'duckdb' (default), 'postgres', 'spark', 'hive', 'trino', 'presto', 'snowflake', 'bigquery', 'redshift', 'mysql', 'mariadb' | VARCHAR | GLOBAL | [] |
| lpts_merge_pipeline | Fuse chains of single-child pipeline operators (Limit/OrderBy/Project/Aggregate/Filter, and pushdown-free base-table scans) into one flat SELECT per query block instead of one CTE per operator. | BOOLEAN | GLOBAL | [] |