Search Shortcut cmd + k | ctrl + k
onelake

This extension allows you to connect DuckDB to Microsoft Fabric OneLake workspaces and lakehouses, enabling you to query data stored in OneLake directly from DuckDB.

Maintainer(s): achrafcei

Installing and Loading

INSTALL onelake FROM community;
LOAD onelake;

Example

-- First, create a secret to authenticate with OneLake.
-- You can use either a service principal or the credential chain (e.g., Azure CLI).
-- Uncomment the desired method and provide the necessary credentials.
set azure_transport_option_type = 'curl';
CREATE SECRET  (
    TYPE azure,
    PROVIDER service_principal,
    TENANT_ID '<your_tenant_id>',
    CLIENT_ID '<your_client_id>',
    CLIENT_SECRET '<your_client_secret>'
);
-- CREATE SECRET  (
--     TYPE azure,
--     PROVIDER credential_chain,
--     CHAIN 'cli'
-- );

CREATE SECRET onelake (
    TYPE ONELAKE,
    TENANT_ID '<your_tenant_id>',
    CLIENT_ID '<your_client_id>',
    CLIENT_SECRET '<your_client_secret>'
);

-- CREATE SECRET  onelake(
--     TYPE ONELAKE,
--     PROVIDER credential_chain,
--     CHAIN 'cli'
-- );

-- Attach to your OneLake workspace and lakehouse
-- You can provide either IDs or names for the workspace and lakehouse.
-- If you wish to only attach to a workspace without specifying a default lakehouse, omit the DEFAULT_LAKEHOUSE parameter.
-- If you however wish to only connect to a specific lakehouse run only the second ATTACH command below.

ATTACH 'onelake://<your_workspace_id>'
      AS <your_connection_name>
      (TYPE ONELAKE, DEFAULT_LAKEHOUSE '<your_lakehouse_id_or_name>');

-- Alternatively, provide workspace and lakehouse names and let the extension resolve them:
ATTACH '<your_workspace_name>/<your_lakehouse_name>.Lakehouse'
    AS <your_connection_name>
    (TYPE ONELAKE);

USE <your_connection_name>;

SHOW TABLES;

SELECT * FROM <your_table_name> LIMIT 10 ; -- USING ICEBERG;

About onelake

This extension enables DuckDB to connect to Microsoft Fabric OneLake workspaces and lakehouses, allowing users to query data stored in OneLake directly from DuckDB. It supports authentication via service principals or credential chains (e.g., Azure CLI) and provides seamless integration with OneLake's data storage capabilities. For detailed setup and usage instructions, visit the extension repository. Current limitations:

  • Only read access is supported; write operations are not yet implemented.
  • Schema enabled lakehouses are not supported.

Added Functions

function_name function_type description comment examples
copy_dir table_macro NULL NULL  
delta_filter_pushdown_log table_macro NULL NULL  
delta_filter_pushdown_log_tpcds table_macro NULL NULL  
delta_list_files table NULL NULL  
delta_scan table NULL NULL  
get_delta_test_expression scalar NULL NULL  
parse_delta_filter_logline macro NULL NULL  
write_blob scalar NULL NULL  

Added Settings

name description input_type scope aliases
auto_fallback_to_full_download Allows automatically falling back to full file downloads when possible. BOOLEAN GLOBAL []
ca_cert_file Path to a custom certificate file for self-signed certificates. VARCHAR GLOBAL []
delta_kernel_logging Forwards the internal logging of the Delta Kernel to the duckdb logger. Warning: this may impact performance even with DuckDB logging disabled. BOOLEAN GLOBAL []
delta_scan_explain_files_filtered Adds the filtered files to the explain output. Warning: this may impact performance of delta scan during explain analyze queries. BOOLEAN GLOBAL []
enable_curl_server_cert_verification Enable server side certificate verification for CURL backend. BOOLEAN GLOBAL []
enable_server_cert_verification Enable server side certificate verification. BOOLEAN GLOBAL []
force_download Forces upfront download of file BOOLEAN GLOBAL []
hf_max_per_page Debug option to limit number of items returned in list requests UBIGINT GLOBAL []
http_keep_alive Keep alive connections. Setting this to false can help when running into connection failures BOOLEAN GLOBAL []
http_retries HTTP retries on I/O error UBIGINT GLOBAL []
http_retry_backoff Backoff factor for exponentially increasing retry wait time FLOAT GLOBAL []
http_retry_wait_ms Time between retries UBIGINT GLOBAL []
http_timeout HTTP timeout read/write/connection/retry (in seconds) UBIGINT GLOBAL []
httpfs_client_implementation Select which is the HTTPUtil implementation to be used VARCHAR GLOBAL []
s3_access_key_id S3 Access Key ID VARCHAR GLOBAL []
s3_endpoint S3 Endpoint VARCHAR GLOBAL []
s3_kms_key_id S3 KMS Key ID VARCHAR GLOBAL []
s3_region S3 Region VARCHAR GLOBAL []
s3_requester_pays S3 use requester pays mode BOOLEAN GLOBAL []
s3_secret_access_key S3 Access Key VARCHAR GLOBAL []
s3_session_token S3 Session Token VARCHAR GLOBAL []
s3_uploader_max_filesize S3 Uploader max filesize (between 50GB and 5TB) VARCHAR GLOBAL []
s3_uploader_max_parts_per_file S3 Uploader max parts per file (between 1 and 10000) UBIGINT GLOBAL []
s3_uploader_thread_limit S3 Uploader global thread limit UBIGINT GLOBAL []
s3_url_compatibility_mode Disable Globs and Query Parameters on S3 URLs BOOLEAN GLOBAL []
s3_url_style S3 URL style VARCHAR GLOBAL []
s3_use_ssl S3 use SSL BOOLEAN GLOBAL []
unsafe_disable_etag_checks Disable checks on ETag consistency BOOLEAN GLOBAL []