Integrates DuckDB with Google BigQuery, allowing direct querying and management of BigQuery datasets
Maintainer(s):
hafenkran
Installing and Loading
INSTALL bigquery FROM community;
LOAD bigquery;
Example
-- Attach to your BigQuery Project
D ATTACH 'project=my_gcp_project' AS bq (TYPE bigquery, READ_ONLY);
-- Show all tables in all datasets in the attached BigQuery project
D SHOW ALL TABLES;
┌──────────┬──────────────────┬──────────┬──────────────┬───────────────────┬───────────┐
│ database │ schema │ name │ column_names │ column_types │ temporary │
│ varchar │ varchar │ varchar │ varchar[] │ varchar[] │ boolean │
├──────────┼──────────────────┼──────────┼──────────────┼───────────────────┼───────────┤
│ bq │ quacking_dataset │ duck_tbl │ [i, s] │ [BIGINT, VARCHAR] │ false │
| bq | barking_dataset | dog_tbl | [i, s] | [BIGINT, VARCHAR] │ false |
└──────────┴──────────────────┴──────────┴──────────────┴───────────────────┴───────────┘
-- Select data from a specific table in BigQuery
D SELECT * FROM bq.quacking_dataset.duck_tbl;
┌───────┬────────────────┐
│ i │ s │
│ int32 │ varchar │
├───────┼────────────────┤
│ 12 │ quack 🦆 │
│ 13 │ quack quack 🦆 │
└───────┴────────────────┘
About bigquery
The DuckDB BigQuery Extension integrates DuckDB with Google BigQuery, allowing direct querying and management of BigQuery datasets. For detailed setup and usage instructions, visit the extension repository.
Added Functions
function_name | function_type | description | comment | examples |
---|---|---|---|---|
bigquery_attach | table | Attach to a BigQuery project. | NULL | [ATTACH 'project=my_gcp_project' as bq (TYPE bigquery);] |
bigquery_scan | table | Scan a single table directly from BigQuery. | NULL | [SELECT * FROM bigquery_scan('my_gcp_project.quacking_dataset.duck_tbl');] |
bigquery_arrow_scan | table | Scan a single table directly from BigQuery (more efficient reimplementation). | NULL | [SELECT * FROM bigquery_arrow_scan('my_gcp_project.quacking_dataset.duck_tbl');] |
bigquery_query | table | Run a custom GoogleSQL query in BigQuery and read the results. | NULL | [SELECT * FROM bigquery_query('bq', 'SELECT * FROM quacking_dataset.duck_tbl WHERE duck_id = 123');] |
bigquery_execute | table | Execute an arbitrary GoogleSQL query in BigQuery. | NULL | [CALL bigquery_execute('bq', 'CREATE SCHEMA deluxe_dataset OPTIONS(location="us", default_table_expiration_days=3.75);')] |
bigquery_jobs | table | List jobs in a BigQuery project. | NULL | [SELECT * FROM bigquery_jobs('bq');] |
bigquery_clear_cache | table | Clear the internal caches to refetch the most current project information from BigQuery. | NULL | [CALL bigquery_clear_cache();] |
Added Settings
name | description | input_type | scope |
---|---|---|---|
bq_arrow_compression | Compression codec for BigQuery Storage Read API. Options: UNSPECIFIED, LZ4_FRAME, ZSTD.Default is LZ4_FRAME. | VARCHAR | GLOBAL |
bq_bignumeric_as_varchar | Read BigQuery BIGNUMERIC data type as VARCHAR | BOOLEAN | GLOBAL |
bq_curl_ca_bundle_path | Path to the CA bundle for curl | VARCHAR | GLOBAL |
bq_debug_show_queries | DEBUG SETTING: print all queries sent to BigQuery to stdout | BOOLEAN | GLOBAL |
bq_default_location | Default location for BigQuery queries | VARCHAR | GLOBAL |
bq_experimental_enable_bigquery_options | Whether to enable BigQuery OPTIONS in CREATE statements | BOOLEAN | GLOBAL |
bq_experimental_filter_pushdown | Whether to use filter pushdown (currently experimental) | BOOLEAN | GLOBAL |
bq_experimental_use_incubating_scan | Whether to use the incubating BigQuery scan implementation. This is currently experimental and is targeted to become the default in the future. | BOOLEAN | GLOBAL |
bq_experimental_use_info_schema | Whether to fetch table infos from BQ information schema (currently experimental). Can be significantly faster than fetching from REST API. | BOOLEAN | GLOBAL |
bq_max_read_streams | Maximum number of read streams for BigQuery Storage Read. Set to 0 to automatically match the number of DuckDB threads. preserve_insertion_order must be false for parallelization to work. |
BIGINT | GLOBAL |
bq_query_timeout_ms | Timeout for BigQuery queries in milliseconds | BIGINT | GLOBAL |