Read-only access to Salesforce orgs as DuckDB SQL tables over the official REST and Bulk APIs — OAuth refresh-token or JWT-bearer auth (credentials from inline options, environment variables, or an SFDX auth URL), native ATTACH, projection + predicate pushdown, COUNT pushdown, explicit server-side aggregates with GROUP BY, lazy/auto/Bulk transports with PK chunking (Bulk blob/base64 compatibility guard), a per-job API quota governor, parent + grandparent relationship STRUCT columns with diagnostics, queryAll (archived + deleted), Tooling-API fast schema, and metadata helpers (manual cache refresh, picklist values, record types).
Installing and Loading
INSTALL salesforce FROM community;
LOAD salesforce;
Example
-- Published build is signed: INSTALL salesforce FROM community; LOAD salesforce;
-- Connect with an OAuth refresh-token Connected App.
-- Set SF_CLIENT_ID, SF_CLIENT_SECRET, SF_REFRESH_TOKEN, and optionally SF_LOGIN_URL.
ATTACH 'salesforce://myorg' AS sf (TYPE salesforce, auth_source 'env');
SELECT Id, Name FROM sf.Account WHERE Name = 'Acme' LIMIT 10;
About salesforce
duckdb-salesforce attaches a Salesforce org as a read-only DuckDB catalog. Tables map to sObjects; SELECT runs over REST /query (or /queryAll for archived + soft-deleted rows), Bulk API 2.0 (lazy-streamed, optional parallel PK chunking), or an auto-selected transport, with SOQL projection + predicate pushdown and COUNT pushdown. salesforce_aggregate() runs explicit server-side SOQL aggregates (MIN/MAX/SUM/AVG/COUNT, optional filter and GROUP BY) without dragging rows down, and salesforce_relationships() reports parent/grandparent expansion. Read-only metadata helpers are available too: salesforce_refresh_metadata() (manual per-ATTACH cache refresh), salesforce_picklist_values(), and salesforce_record_types(). A Bulk blob/base64 compatibility guard keeps incompatible scans off Bulk, and blob bodies / epoch datetimes have clear documented limitations. Opt-in parent- and grandparent-relationship STRUCT columns, a per-job API quota governor, and Tooling-API fast schema discovery round it out. Authentication is OAuth 2.0 refresh-token or JWT bearer, with credentials from inline options, environment variables, or an SFDX auth URL; credentials stay in memory and are never logged; TLS server-certificate verification is always on. Read-only: all mutating catalog operations throw.
Added Functions
| function_name | function_type | description | comment | examples |
|---|---|---|---|---|
| salesforce_aggregate | table | NULL | NULL | |
| salesforce_decode | table | NULL | NULL | |
| salesforce_describe | table | NULL | NULL | |
| salesforce_describe_calls | table | NULL | NULL | |
| salesforce_global_describe_calls | table | NULL | NULL | |
| salesforce_last_bulk_create_body | table | NULL | NULL | |
| salesforce_last_quota | table | NULL | NULL | |
| salesforce_last_scan_pages | table | NULL | NULL | |
| salesforce_last_soql | table | NULL | NULL | |
| salesforce_last_transport | table | NULL | NULL | |
| salesforce_picklist_values | table | NULL | NULL | |
| salesforce_query | table | NULL | NULL | |
| salesforce_query_cost | table | NULL | NULL | |
| salesforce_record_types | table | NULL | NULL | |
| salesforce_refresh_metadata | table | NULL | NULL | |
| salesforce_relationships | table | NULL | NULL | |
| salesforce_tooling_calls | table | NULL | NULL | |
| sf_url_encode | scalar | NULL | NULL |
Overloaded Functions
This extension does not add any function overloads.
Added Types
This extension does not add any types.
Added Settings
| name | description | input_type | scope | aliases |
|---|---|---|---|---|
| sf_auto_bulk_threshold | For sf_force_transport='auto': estimated row count above which Bulk is chosen over REST (default 50000). | BIGINT | GLOBAL | [] |
| sf_auto_probe | For sf_force_transport='auto': run the COUNT() row-count probe (default true). When false, 'auto' always resolves to REST. | BOOLEAN | GLOBAL | [] |
| sf_bulk_chunks | Bulk PK chunking: split a Bulk scan into N disjoint Id ranges (1 = off, default; capped at 8). Sequential in this cut. Bulk transport only. | BIGINT | GLOBAL | [] |
| sf_force_transport | Scan transport: 'rest' (default), 'bulk' (Bulk API 2.0), or 'auto' (choose by row-count probe). 'bulk' is for large extractions / CREATE TABLE AS / COPY; same SOQL (projection + predicate pushdown) either way. | VARCHAR | GLOBAL | [] |
| sf_mock_bulk_create_body | TEST ONLY. Bulk job-create response body. | VARCHAR | GLOBAL | [] |
| sf_mock_bulk_create_status | TEST ONLY. Bulk job-create HTTP status. | BIGINT | GLOBAL | [] |
| sf_mock_bulk_results_body | TEST ONLY. Bulk results CSV page(s) ('|~|' per page). | VARCHAR | GLOBAL | [] |
| sf_mock_bulk_results_locator | TEST ONLY. Sforce-Locator per results page (comma-separated; empty = last). | VARCHAR | GLOBAL | [] |
| sf_mock_bulk_results_status | TEST ONLY. Bulk results HTTP status(es), CSV. | VARCHAR | GLOBAL | [] |
| sf_mock_bulk_status_body | TEST ONLY. Bulk status body/bodies ('|~|' per poll). | VARCHAR | GLOBAL | [] |
| sf_mock_bulk_status_code | TEST ONLY. Bulk status HTTP status(es), CSV. | VARCHAR | GLOBAL | [] |
| sf_mock_count_body | TEST ONLY. Body for the mocked COUNT() probe (reads totalSize). | VARCHAR | GLOBAL | [] |
| sf_mock_count_status | TEST ONLY. Statuses for the mocked COUNT() probe GET. | VARCHAR | GLOBAL | [] |
| sf_mock_describe_body | TEST ONLY. Bodies for mocked describe GETs ('|~|'-separated). | VARCHAR | GLOBAL | [] |
| sf_mock_describe_status | TEST ONLY. Statuses for mocked describe GETs (e.g. '200', '401,200'). | VARCHAR | GLOBAL | [] |
| sf_mock_env | TEST ONLY. Override environment-variable lookup for auth_source env/sfdx_url ('NAME=value;…'). Empty uses the real OS environment. | VARCHAR | GLOBAL | [] |
| sf_mock_limits_body | TEST ONLY. Body for the mocked GET /limits. | VARCHAR | GLOBAL | [] |
| sf_mock_limits_status | TEST ONLY. Statuses for the mocked GET /limits. | VARCHAR | GLOBAL | [] |
| sf_mock_query_body | TEST ONLY. Bodies for mocked query GET pages ('|~|'-separated). | VARCHAR | GLOBAL | [] |
| sf_mock_query_status | TEST ONLY. Statuses for mocked query GETs (e.g. '200,200'). | VARCHAR | GLOBAL | [] |
| sf_mock_queryall_body | TEST ONLY. Body/pages for the mocked GET /queryAll ('|~|'). | VARCHAR | GLOBAL | [] |
| sf_mock_queryall_status | TEST ONLY. Statuses for the mocked GET /queryAll. | VARCHAR | GLOBAL | [] |
| sf_mock_sobjects_body | TEST ONLY. Body for mocked global describe (GET /sobjects). | VARCHAR | GLOBAL | [] |
| sf_mock_sobjects_status | TEST ONLY. Statuses for mocked global describe (GET /sobjects). | VARCHAR | GLOBAL | [] |
| sf_mock_token_body | TEST ONLY. Response body paired with sf_mock_token_status. | VARCHAR | GLOBAL | [] |
| sf_mock_token_status | TEST ONLY. HTTP status for a mocked Salesforce token-endpoint response. 0 disables the mock and uses the live transport (default). | BIGINT | GLOBAL | [] |
| sf_mock_tooling_body | TEST ONLY. Body/pages for the mocked GET /tooling/query ('|~|'). | VARCHAR | GLOBAL | [] |
| sf_mock_tooling_status | TEST ONLY. Statuses for the mocked GET /tooling/query. | VARCHAR | GLOBAL | [] |
| sf_query_mode | Read mode: 'query' (default) or 'queryAll' (also returns archived + soft-deleted records). Affects the scan (REST + Bulk) and its probes. | VARCHAR | GLOBAL | [] |
| sf_quota_cache_seconds | Quota governor: in-memory TTL for a cached /limits snapshot, per instance_url (default 60; 0 disables caching). | BIGINT | GLOBAL | [] |
| sf_quota_enabled | Quota governor: gate Bulk job starts on the org's API quota (default true). false skips /limits and never blocks. | BOOLEAN | GLOBAL | [] |
| sf_quota_enforce | Quota governor: block when below reserve (default true). false = consult /limits and report, but proceed (warn-only). | BOOLEAN | GLOBAL | [] |
| sf_quota_fail_open | Quota governor: when /limits is unavailable, allow the Bulk job (default true). false blocks with a clear error. | BOOLEAN | GLOBAL | [] |
| sf_quota_min_remaining | Quota governor: absolute floor of remaining DailyApiRequests below which Bulk is refused (default 1000). | BIGINT | GLOBAL | [] |
| sf_quota_reserve_pct | Quota governor: keep this %% of DailyApiRequests.Max in reserve (default 10). | BIGINT | GLOBAL | [] |
| sf_relationship_depth | Parent traversal depth when sf_relationships='parent': 1 (default, parent only) or 2 (also grandparent, nested STRUCT). Capped at 2. | BIGINT | GLOBAL | [] |
| sf_relationships | Parent relationship traversal: 'off' (default) or 'parent' (expose each single-target parent as a STRUCT column, e.g. SELECT Account.Name FROM sf.Contact). Polymorphic/child relationships not expanded. | VARCHAR | GLOBAL | [] |
| sf_schema_source | Schema discovery: 'describe' (default, REST, authoritative) or 'tooling' (fast batched Tooling API FieldDefinition; falls back to REST describe per object on error/absent/ambiguous type; coarser types; fields default non-filterable unless Tooling marks them filterable). | VARCHAR | GLOBAL | [] |