Search Shortcut cmd + k | ctrl + k
lpts

Optimized-plan inspection and cross-system SQL transpilation

Maintainer(s): ila

Installing and Loading

INSTALL lpts FROM community;
LOAD lpts;

Example

-- Set both input and output dialects to DuckDB for a first round trip.
D SET lpts_input_dialect = 'duckdb';
D SET lpts_dialect = 'duckdb';

D CREATE TABLE users (id INTEGER, name VARCHAR, age INTEGER);
D INSERT INTO users VALUES (1, 'Alice', 30), (2, 'Bob', 22), (3, 'Carol', 28);

-- Return the optimized plan as readable CTE SQL.
D PRAGMA lpts('SELECT name FROM users WHERE age > 25');
┌──────────────────────────────────────────────────────────────────────────────┐
 sql                                                                          
 varchar                                                                      
├──────────────────────────────────────────────────────────────────────────────┤
 WITH scan_0 (t0_name) AS (SELECT name FROM memory.main.users WHERE age>25), 
 projection_1 (t1_name) AS (SELECT t0_name FROM scan_0)                      
 SELECT t1_name AS "name" FROM projection_1;                                 
└──────────────────────────────────────────────────────────────────────────────┘

-- Check that the generated SQL returns the same bag of rows as the input query.
D PRAGMA lpts_check('SELECT name FROM users WHERE age > 25');
┌─────────┐
  match  
 boolean 
├─────────┤
 true    
└─────────┘

About lpts

LPTS is a DuckDB extension for optimized-plan inspection and cross-system SQL transpilation. LPTS takes DuckDB's post-optimizer logical plan and reconstructs equivalent SQL as a sequence of named CTEs.

PRAGMA Syntax

PRAGMA lpts('<query>');

Example:

-- Set both input and output dialects to DuckDB for a first round trip.
D SET lpts_input_dialect = 'duckdb';
D SET lpts_dialect = 'duckdb';

D CREATE TABLE users (id INTEGER, name VARCHAR, age INTEGER);
D INSERT INTO users VALUES (1, 'Alice', 30), (2, 'Bob', 22), (3, 'Carol', 28);

-- Return the optimized plan as readable CTE SQL.
D PRAGMA lpts('SELECT name FROM users WHERE age > 25');
┌──────────────────────────────────────────────────────────────────────────────┐
│ sql                                                                          │
│ varchar                                                                      │
├──────────────────────────────────────────────────────────────────────────────┤
│ WITH scan_0 (t0_name) AS (SELECT name FROM memory.main.users WHERE age>25), │
│ projection_1 (t1_name) AS (SELECT t0_name FROM scan_0)                      │
│ SELECT t1_name AS "name" FROM projection_1;                                 │
└──────────────────────────────────────────────────────────────────────────────┘

LPTS plans the query through DuckDB, optimizes it, then serializes the optimized logical plan.

Supported Dialects

The dialect settings accept these values:

Dialect Accepted values
DuckDB duckdb
PostgreSQL postgres, postgresql
Spark SQL spark
Hive hive
Trino / Presto trino, presto
Snowflake snowflake
BigQuery bigquery, bq
Redshift redshift
MySQL / MariaDB mysql, mariadb

Use Cases

  • Inspect optimized DuckDB plans as SQL.
  • Debug optimizer rewrites such as filter pushdown, join reordering, top-N, materialized CTEs, and subquery decorrelation.
  • Generate a CTE program that communicates the optimized execution shape.
  • Emit SQL for another engine with lpts_dialect.
  • Convert other SQL dialect syntax to DuckDB SQL with lpts_input_dialect, then execute or inspect it.

Supported Operators

LPTS is intended to cover all logical operators produced by optimized DuckDB SELECT plans. The current regression suite round-trips all 22 TPC-H queries and exercises joins, aggregates, windows, set operations, CTEs, recursive CTEs, table functions, DuckLake scans, and inserts.

Unsupported optimizer edge cases fail explicitly with NotImplementedException.

Examples

D CREATE TABLE events (id INTEGER, ts TIMESTAMP, name VARCHAR, "order" INTEGER);
D INSERT INTO events VALUES
    (1, TIMESTAMP '2024-01-15 08:09:10', 'alpha', 10),
    (11, TIMESTAMP '2024-01-16 11:12:13', 'beta', 20);

-- Render generated SQL for PostgreSQL.
D SET lpts_dialect = 'postgres';

-- Return generated CTE SQL directly in the shell.
D PRAGMA lpts(
    'SELECT strftime(ts, ''%Y-%m-%d'') AS day
     FROM events
     WHERE id > 10
     ORDER BY day'
);

-- Return generated CTE SQL as a table row, useful in scripts and tests.
D SELECT sql
FROM lpts_query(
    'SELECT strftime(ts, ''%Y-%m-%d'') AS day
     FROM events
     WHERE id > 10
     ORDER BY day'
);
┌────────────────────────────────────────────────────────────────────────────────────────┐
│ sql                                                                                    │
│ varchar                                                                                │
├────────────────────────────────────────────────────────────────────────────────────────┤
│ WITH scan_0 (t0_ts) AS (SELECT ts FROM events WHERE id>10),                           │
│ projection_1 (t1_day) AS (SELECT to_char(t0_ts, 'YYYY-MM-DD') FROM scan_0),            │
│ order_2 (t1_day) AS (SELECT t1_day FROM projection_1 ORDER BY t1_day ASC NULLS LAST)   │
│ SELECT t1_day AS "day" FROM order_2;                                                   │
└────────────────────────────────────────────────────────────────────────────────────────┘
-- Switch back to DuckDB rendering.
D SET lpts_dialect = 'duckdb';

-- Execute the generated SQL and return the query result.
D PRAGMA lpts_exec('SELECT name FROM events WHERE id > 10 ORDER BY name');

-- Compare original and generated SQL using bag equality.
D PRAGMA lpts_check('SELECT name FROM events WHERE id > 10 ORDER BY name');
┌─────────┐
│  name   │
│ varchar │
├─────────┤
│ beta    │
└─────────┘

┌─────────┐
│  match  │
│ boolean │
├─────────┤
│ true    │
└─────────┘
-- Print the AST tree to stdout for interactive debugging.
D PRAGMA print_ast('SELECT name FROM events WHERE id > 10 ORDER BY name');

-- Return the AST tree as a table row, useful for tools and regression tests.
D SELECT ast
FROM print_ast_query('SELECT name FROM events WHERE id > 10 ORDER BY name');
-- Normalize MySQL syntax before DuckDB parses and plans the query.
D SET lpts_input_dialect = 'mysql';

-- Return source-dialect SQL normalized to DuckDB SQL.
D SELECT sql
FROM lpts_normalize_query(
    'SELECT `order`, DATE_FORMAT(ts, ''%Y-%m-%d %H:%i:%s'') AS formatted FROM events LIMIT 5, 10'
);
SELECT "order", strftime(ts, '%Y-%m-%d %H:%M:%S') AS formatted FROM events LIMIT 10 OFFSET 5

Added Functions

function_name function_type description comment examples
lpts pragma Return the optimized logical plan for a query as equivalent CTE SQL. NULL [PRAGMA lpts('SELECT name FROM users WHERE age > 25')]
lpts_check pragma Compare the original query and the LPTS-generated query using EXCEPT ALL in both directions. NULL [PRAGMA lpts_check('SELECT name FROM users WHERE age > 25')]
lpts_exec pragma Execute the SQL generated by LPTS and return its result rows. NULL [PRAGMA lpts_exec('SELECT name FROM users WHERE age > 25')]
lpts_normalize_query table Normalize SQL from lpts_input_dialect into DuckDB SQL without planning it. NULL [SET lpts_input_dialect = 'mysql'; SELECT sql FROM lpts_normalize_query('SELECT order FROM users LIMIT 1, 2')]
lpts_query table Table-function form of PRAGMA lpts. Returns the generated CTE SQL as a single sql column. NULL [SELECT sql FROM lpts_query('SELECT name FROM users WHERE age > 25')]
print_ast pragma Print the LPTS AST tree for a query to stdout. NULL [PRAGMA print_ast('SELECT name FROM users WHERE age > 25')]
print_ast_query table Table-function form of PRAGMA print_ast. Returns the rendered AST tree as a single ast column. NULL [SELECT ast FROM print_ast_query('SELECT name FROM users WHERE age > 25')]

Overloaded Functions

This extension does not add any function overloads.

Added Types

This extension does not add any types.

Added Settings

name description input_type scope aliases
lpts_dialect SQL dialect for lpts output. Valid values: 'duckdb' (default), 'postgres', 'spark', 'hive', 'trino', 'presto', 'snowflake', 'bigquery', 'redshift', 'mysql', 'mariadb' VARCHAR GLOBAL []
lpts_enable_data_dependent_optimizers Enable LPTS planning optimizers that depend on current data, statistics, cardinality estimates, row groups, or runtime dynamic filters. BOOLEAN GLOBAL []
lpts_input_dialect SQL dialect for lpts input normalization. Valid values: 'duckdb' (default), 'postgres', 'spark', 'hive', 'trino', 'presto', 'snowflake', 'bigquery', 'redshift', 'mysql', 'mariadb' VARCHAR GLOBAL []