This extension allows you to connect DuckDB to Microsoft Fabric OneLake workspaces and lakehouses, enabling you to query data stored in OneLake directly from DuckDB.
Maintainer(s):
achrafcei
Installing and Loading
INSTALL onelake FROM community;
LOAD onelake;
Example
-- First, create a secret to authenticate with OneLake.
-- You can use either a service principal or the credential chain (e.g., Azure CLI).
-- Uncomment the desired method and provide the necessary credentials.
set azure_transport_option_type = 'curl';
CREATE SECRET (
TYPE azure,
PROVIDER service_principal,
TENANT_ID '<your_tenant_id>',
CLIENT_ID '<your_client_id>',
CLIENT_SECRET '<your_client_secret>'
);
-- CREATE SECRET (
-- TYPE azure,
-- PROVIDER credential_chain,
-- CHAIN 'cli'
-- );
CREATE SECRET onelake (
TYPE ONELAKE,
TENANT_ID '<your_tenant_id>',
CLIENT_ID '<your_client_id>',
CLIENT_SECRET '<your_client_secret>'
);
-- CREATE SECRET onelake(
-- TYPE ONELAKE,
-- PROVIDER credential_chain,
-- CHAIN 'cli'
-- );
-- Attach to your OneLake workspace and lakehouse
-- You can provide either IDs or names for the workspace and lakehouse.
-- If you wish to only attach to a workspace without specifying a default lakehouse, omit the DEFAULT_LAKEHOUSE parameter.
-- If you however wish to only connect to a specific lakehouse run only the second ATTACH command below.
ATTACH 'onelake://<your_workspace_id>'
AS <your_connection_name>
(TYPE ONELAKE, DEFAULT_LAKEHOUSE '<your_lakehouse_id_or_name>');
-- Alternatively, provide workspace and lakehouse names and let the extension resolve them:
ATTACH '<your_workspace_name>/<your_lakehouse_name>.Lakehouse'
AS <your_connection_name>
(TYPE ONELAKE);
USE <your_connection_name>;
SHOW TABLES;
SELECT * FROM <your_table_name> LIMIT 10 ; -- USING ICEBERG;
About onelake
This extension enables DuckDB to connect to Microsoft Fabric OneLake workspaces and lakehouses, allowing users to query data stored in OneLake directly from DuckDB. It supports authentication via service principals or credential chains (e.g., Azure CLI) and provides seamless integration with OneLake's data storage capabilities. For detailed setup and usage instructions, visit the extension repository. Current limitations:
- Only read access is supported; write operations are not yet implemented.
- Schema enabled lakehouses are not supported.
Added Functions
| function_name | function_type | description | comment | examples |
|---|---|---|---|---|
| copy_dir | table_macro | NULL | NULL | |
| delta_filter_pushdown_log | table_macro | NULL | NULL | |
| delta_filter_pushdown_log_tpcds | table_macro | NULL | NULL | |
| delta_list_files | table | NULL | NULL | |
| delta_scan | table | NULL | NULL | |
| get_delta_test_expression | scalar | NULL | NULL | |
| parse_delta_filter_logline | macro | NULL | NULL | |
| write_blob | scalar | NULL | NULL |
Added Settings
| name | description | input_type | scope | aliases |
|---|---|---|---|---|
| auto_fallback_to_full_download | Allows automatically falling back to full file downloads when possible. | BOOLEAN | GLOBAL | [] |
| ca_cert_file | Path to a custom certificate file for self-signed certificates. | VARCHAR | GLOBAL | [] |
| delta_kernel_logging | Forwards the internal logging of the Delta Kernel to the duckdb logger. Warning: this may impact performance even with DuckDB logging disabled. | BOOLEAN | GLOBAL | [] |
| delta_scan_explain_files_filtered | Adds the filtered files to the explain output. Warning: this may impact performance of delta scan during explain analyze queries. | BOOLEAN | GLOBAL | [] |
| enable_curl_server_cert_verification | Enable server side certificate verification for CURL backend. | BOOLEAN | GLOBAL | [] |
| enable_server_cert_verification | Enable server side certificate verification. | BOOLEAN | GLOBAL | [] |
| force_download | Forces upfront download of file | BOOLEAN | GLOBAL | [] |
| hf_max_per_page | Debug option to limit number of items returned in list requests | UBIGINT | GLOBAL | [] |
| http_keep_alive | Keep alive connections. Setting this to false can help when running into connection failures | BOOLEAN | GLOBAL | [] |
| http_retries | HTTP retries on I/O error | UBIGINT | GLOBAL | [] |
| http_retry_backoff | Backoff factor for exponentially increasing retry wait time | FLOAT | GLOBAL | [] |
| http_retry_wait_ms | Time between retries | UBIGINT | GLOBAL | [] |
| http_timeout | HTTP timeout read/write/connection/retry (in seconds) | UBIGINT | GLOBAL | [] |
| httpfs_client_implementation | Select which is the HTTPUtil implementation to be used | VARCHAR | GLOBAL | [] |
| s3_access_key_id | S3 Access Key ID | VARCHAR | GLOBAL | [] |
| s3_endpoint | S3 Endpoint | VARCHAR | GLOBAL | [] |
| s3_kms_key_id | S3 KMS Key ID | VARCHAR | GLOBAL | [] |
| s3_region | S3 Region | VARCHAR | GLOBAL | [] |
| s3_requester_pays | S3 use requester pays mode | BOOLEAN | GLOBAL | [] |
| s3_secret_access_key | S3 Access Key | VARCHAR | GLOBAL | [] |
| s3_session_token | S3 Session Token | VARCHAR | GLOBAL | [] |
| s3_uploader_max_filesize | S3 Uploader max filesize (between 50GB and 5TB) | VARCHAR | GLOBAL | [] |
| s3_uploader_max_parts_per_file | S3 Uploader max parts per file (between 1 and 10000) | UBIGINT | GLOBAL | [] |
| s3_uploader_thread_limit | S3 Uploader global thread limit | UBIGINT | GLOBAL | [] |
| s3_url_compatibility_mode | Disable Globs and Query Parameters on S3 URLs | BOOLEAN | GLOBAL | [] |
| s3_url_style | S3 URL style | VARCHAR | GLOBAL | [] |
| s3_use_ssl | S3 use SSL | BOOLEAN | GLOBAL | [] |
| unsafe_disable_etag_checks | Disable checks on ETag consistency | BOOLEAN | GLOBAL | [] |