⌘+k ctrl+k
1.5 (current)
Search Shortcut cmd + k | ctrl + k
Delta Extension

The delta extension adds support for the Delta Lake open-source storage format. It is built using the Delta Kernel. The extension offers read and write support for Delta tables, both local and remote.

For implementation details, see the announcement blog post.

Warning We are aware of a regression in Azure Onelake which appears to be a consequence of a change in delta-kernel-rs. You can track the issue on GitHub.

To connect to Unity Catalog, DuckDB has the unity_catalog extension.

Installing and Loading

The delta extension will be transparently autoloaded on first use from the official extension repository. If you would like to install and load it manually, run:

INSTALL delta;
LOAD delta;

Usage

To scan a local Delta table, run:

SELECT *
FROM delta_scan('file:///some/path/on/local/machine');

Reading from an S3 Bucket

To scan a Delta table in an S3 bucket, run:

SELECT *
FROM delta_scan('s3://some/delta/table');

For authenticating to S3 buckets, DuckDB Secrets are supported:

CREATE SECRET (
    TYPE s3,
    PROVIDER credential_chain
);
SELECT *
FROM delta_scan('s3://some/delta/table/with/auth');

To scan public buckets on S3, you may need to pass the correct region by creating a secret containing the region of your public S3 bucket:

CREATE SECRET (
    TYPE s3,
    REGION 'my-region'
);
SELECT *
FROM delta_scan('s3://some/public/table/in/my-region');

Reading from Azure Blob Storage

To scan a Delta table in an Azure Blob Storage bucket, run:

SELECT *
FROM delta_scan('az://my-container/my-table');

For authenticating to Azure Blob Storage, DuckDB Secrets are supported:

CREATE SECRET (
    TYPE azure,
    PROVIDER credential_chain
);
SELECT *
FROM delta_scan('az://my-container/my-table-with-auth');

Reading from Google Cloud Storage

To scan a Delta table in a GCS bucket, use HMAC keys and create a secret:

CREATE SECRET (
    TYPE gcs,
    KEY_ID 'hmac-key-id',
    SECRET 'hmac-secret'
);
SELECT *
FROM delta_scan('gs://my-bucket/my-delta-table');

Appending Data

To append rows to a Delta table, attach it and use INSERT INTO:

ATTACH 's3://my-bucket/my-delta-table' AS my_table (TYPE delta);
INSERT INTO my_table SELECT * FROM other_table;

Time Travel

To read a specific version of a Delta table, attach it and use the AT (VERSION => n) clause:

ATTACH 's3://my-bucket/my-delta-table' AS my_table (TYPE delta);
SELECT * FROM my_table AT (VERSION => 5);

Alternatively, pin a version at attach time:

ATTACH 's3://my-bucket/my-delta-table' AS my_table (TYPE delta, VERSION 5);

Checkpointing

To compact the Delta log of an attached table into a checkpoint file:

ATTACH 'path/to/my-delta-table' AS my_table (TYPE delta);
CHECKPOINT my_table;

Idempotent Appends

The delta extension exposes an idempotent-append API that uses Delta's per-application transaction versions to give exactly-once semantics: an append tagged with an app_id and a version is only committed if the table's currently recorded version for that app_id matches the expected previous version. This lets a producer safely retry a batch without duplicating it.

Within a transaction, tag the append with delta_set_transaction_version(⟨table⟩, ⟨app_id⟩, ⟨new_version⟩, ⟨expected_previous_version⟩):

ATTACH 'path/to/my-delta-table' AS my_table (TYPE delta);

BEGIN TRANSACTION;
CALL delta_set_transaction_version('my_table', 'my_app_id', 1::UBIGINT, NULL::UBIGINT);
INSERT INTO my_table VALUES (1);
COMMIT;

On COMMIT, the version is compared-and-swapped: if another process advanced the version for my_app_id in the meantime, the commit fails. Aborting the transaction leaves the version unchanged. Read the current version with delta_get_transaction_version(⟨table⟩, ⟨app_id⟩), which returns NULL if no version has been recorded yet.

Attach Options

When attaching a Delta table you can pass the following options to ATTACH:

Option Type Default Description
VERSION UBIGINT latest Pin the attached table to a specific table version.
PIN_SNAPSHOT BOOLEAN false Resolve the table snapshot once at attach time and reuse it, rather than re-resolving the latest version per query.
PUSHDOWN_PARTITION_INFO BOOLEAN true Push down partition information so that whole files can be skipped based on partition values.
PUSHDOWN_FILTERS VARCHAR all Filter pushdown mode for file skipping. One of none, all, constant_only, dynamic_only.
ATTACH 's3://my-bucket/my-delta-table' AS my_table (
    TYPE delta,
    PIN_SNAPSHOT true,
    PUSHDOWN_FILTERS 'constant_only'
);

Inspecting Scanned Files

delta_list_files returns the data files a scan would read for a table, together with their cardinality, partition values, and whether they carry deletion vectors. This is useful for understanding the effect of data skipping:

SELECT * FROM delta_list_files('file:///some/path/on/local/machine');
Column Type Description
data_file VARCHAR Path to the Parquet data file.
cardinality UBIGINT Number of rows in the file.
partitions MAP(VARCHAR, VARCHAR) Partition column values for the file.
have_deletes BOOLEAN Whether the file has an associated deletion vector.

Credential Chains in Delta

DuckDB Delta uses delta-kernel-rs and object_store for some network operations. These systems have a different ordering (and inclusion defaults) for credential chains. If your system has multiple credential sources available, e.g., both Service Principal via the environment and a CLI-based option, credential loading behavior may be inconsistent.

To avoid ambiguities, we recommend that you configure exactly one available credential type in your production chain secrets.

Settings

The delta extension adds the following settings:

Setting Type Default Description
delta_kernel_logging BOOLEAN false Forward the internal logging of the Delta Kernel to the DuckDB logger. May impact performance even when DuckDB logging is disabled.
delta_scan_explain_files_filtered BOOLEAN true Add the filtered files to the EXPLAIN output. May impact the performance of delta_scan during EXPLAIN ANALYZE queries.

Features

The delta extension supports:

  • multithreaded scans and Parquet metadata reading
  • data skipping/filter pushdown
    • skipping row groups in file (based on Parquet metadata)
    • skipping complete files (based on Delta partition information)
  • projection pushdown
  • scanning tables with deletion vectors
  • all primitive types
  • structs
  • VARIANT type
  • blind appends (INSERT INTO)
  • cloud storage (AWS S3, Azure, GCS) with secrets

Supported Platforms

The delta extension currently only supports the following platforms:

  • Linux AMD64 (x86_64 and ARM64): linux_amd64 and linux_arm64
  • macOS Intel and Apple Silicon: osx_amd64 and osx_arm64
  • Windows AMD64: windows_amd64

Support for the other DuckDB platforms is work-in-progress.

Using delta-rs with DuckDB

In this example, we create a Delta table with the delta-rs Python binding, then we use the delta extension of DuckDB to read it. We also showcase how to do other read operations with DuckDB, like reading the change data feed using the Arrow zero-copy integration. This operation can also be lazy if reading bigger data by using Arrow Datasets.

Click here to see the full example.
import deltalake as dl
import pyarrow as pa

# Create a delta table and read it with DuckDB Delta extension
dl.write_deltalake(
    "tmp/some_table",
    pa.table({
        "id": [1, 2, 3],
        "value": ["a", "b", "c"]
    })
)
with duckdb.connect() as conn:
    conn.execute("""
        INSTALL delta;
        LOAD delta;
    """)
    conn.sql("""
        SELECT * FROM delta_scan('tmp/some_table')
    """).show()

# Append some data and read the data change feed using the PyArrow integration
dl.write_deltalake(
    "tmp/some_table",
    pa.table({
        "id": [4, 5],
        "value": ["d", "e"]
    }),
    mode="append"
)
table = dl.DeltaTable("tmp/some_table").load_cdf(starting_version=1, ending_version=2)
with duckdb.connect() as conn:
    conn.register("t", table)
    conn.sql("SELECT * FROM t").show()
© 2026 DuckDB Foundation, Amsterdam NL
DuckDB Home Code of Conduct Trademark Use Blog