The lance extension adds support for reading and writing Lance tables. Lance is a modern lakehouse format optimized for ML/AI workloads, with native cloud storage support.
Installing and Loading
You can install the lance extension from DuckDB's core extensions repository and load it using the following commands:
INSTALL lance;
LOAD lance;
Usage
Query a Lance Dataset
Local file:
SELECT *
FROM 'path/to/dataset.lance'
LIMIT 10;
S3:
SELECT *
FROM 's3://bucket/path/to/dataset.lance'
LIMIT 10;
To access object store URIs (e.g., s3://...), configure a TYPE lance secret using the Secrets Manager:
CREATE SECRET (
    TYPE lance,
    PROVIDER credential_chain,
    SCOPE 's3://bucket/'
);
SELECT *
FROM 's3://bucket/path/to/dataset.lance'
LIMIT 10;
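Lance datasets scanned this way behave like any other table expression, so filters and aggregates compose as usual. A small sketch (the column names `id` and `label` are illustrative, not part of any particular dataset):

```sql
-- Aggregate over a Lance dataset like any other table
SELECT label, count(*) AS n
FROM 's3://bucket/path/to/dataset.lance'
WHERE id > 100
GROUP BY label
ORDER BY n DESC;
```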
Write a Lance Dataset
Use the COPY ... TO ... statement to materialize query results as a Lance dataset.
-- Create/overwrite a Lance dataset from a query
COPY (
    SELECT 1::BIGINT AS id, 'a'::VARCHAR AS s
    UNION ALL
    SELECT 2::BIGINT AS id, 'b'::VARCHAR AS s
) TO 'path/to/out.lance' (
    FORMAT lance,
    MODE 'overwrite'
);
-- Read it back via the replacement scan
SELECT count(*) FROM 'path/to/out.lance';
-- Append more rows to an existing dataset
COPY (
    SELECT 3::BIGINT AS id, 'c'::VARCHAR AS s
) TO 'path/to/out.lance' (
    FORMAT lance,
    MODE 'append'
);
-- Optionally create an empty dataset (schema only):
-- a zero-row query together with WRITE_EMPTY_FILE still writes the schema
COPY (
    SELECT 1::BIGINT AS id, 'x'::VARCHAR AS s
    LIMIT 0
) TO 'path/to/empty.lance' (
    FORMAT lance,
    MODE 'overwrite',
    WRITE_EMPTY_FILE true
);
To write to s3://... paths, configure a TYPE lance secret for that scope using the Secrets Manager:
CREATE SECRET (
    TYPE lance,
    PROVIDER credential_chain,
    SCOPE 's3://bucket/'
);
COPY (SELECT 1 AS id)
TO 's3://bucket/path/to/out.lance'
(FORMAT lance, MODE 'overwrite');
Create a Lance Dataset via CREATE TABLE (Directory Namespace)
When you ATTACH a directory as a Lance namespace, you can create new datasets using CREATE TABLE (schema-only) or CREATE TABLE AS SELECT (CTAS). The dataset is written to namespace_root/table_name.lance.
ATTACH 'path/to/dir' AS lance_ns (TYPE lance);
-- Schema-only (creates an empty dataset)
CREATE TABLE lance_ns.main.my_empty (id BIGINT, s VARCHAR);
-- CTAS (writes query results)
CREATE TABLE lance_ns.main.my_dataset AS
SELECT 1::BIGINT AS id, 'a'::VARCHAR AS s
UNION ALL
SELECT 2::BIGINT AS id, 'b'::VARCHAR AS s;
SELECT count(*) FROM lance_ns.main.my_dataset;
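After ATTACH, the directory behaves like a regular DuckDB catalog, so the usual catalog commands should apply. A sketch, assuming each *.lance dataset under the namespace root shows up as a table:

```sql
ATTACH 'path/to/dir' AS lance_ns (TYPE lance);
USE lance_ns;
SHOW TABLES;  -- expect the datasets in the directory, e.g., my_dataset
```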
Vector Search
-- Search a vector column, returning distances in `_distance` (smaller is closer)
SELECT id, label, _distance
FROM lance_vector_search(
    'path/to/dataset.lance', 'vec',
    [0.1, 0.2, 0.3, 0.4]::FLOAT[4],
    k = 5,
    prefilter = true
)
ORDER BY _distance ASC;
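Because lance_vector_search is a table function, its results compose with ordinary SQL. For example, the top-k matches can be joined back to the full dataset to pull in columns the search did not return (the column name `extra_info` is illustrative):

```sql
-- Join top-k matches back to the dataset for extra columns
SELECT s.id, s._distance, d.extra_info
FROM lance_vector_search(
    'path/to/dataset.lance', 'vec',
    [0.1, 0.2, 0.3, 0.4]::FLOAT[4],
    k = 5
) AS s
JOIN 'path/to/dataset.lance' AS d USING (id);
```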
See the SQL reference for full parameter documentation.
Full-Text Search (FTS)
-- Search a text column, returning BM25-like scores in `_score`
SELECT id, text, _score
FROM lance_fts(
    'path/to/dataset.lance',
    'text',
    'puppy',
    k = 10,
    prefilter = true
)
ORDER BY _score DESC;
See the SQL reference for full parameter documentation.
Hybrid Search (Vector + FTS)
-- Combine vector and text scores, returning `_hybrid_score` in addition to `_distance` / `_score`
SELECT id, _hybrid_score, _distance, _score
FROM lance_hybrid_search(
    'path/to/dataset.lance',
    'vec', [0.1, 0.2, 0.3, 0.4]::FLOAT[4],
    'text', 'puppy',
    k = 10, prefilter = false,
    alpha = 0.5, oversample_factor = 4
)
ORDER BY _hybrid_score DESC;
Limitations
The lance extension is currently available for the following platforms: linux_amd64, linux_arm64, osx_arm64, and windows_amd64.