Integrates DuckDB with Google BigQuery, allowing direct querying and management of BigQuery datasets
Maintainer(s):
hafenkran
Installing and Loading
INSTALL bigquery FROM community;
LOAD bigquery;
Example
-- Attach to your BigQuery Project
D ATTACH 'project=my_gcp_project' AS bq (TYPE bigquery, READ_ONLY);
-- Show all tables in all datasets in the attached BigQuery project
D SHOW ALL TABLES;
┌──────────┬──────────────────┬──────────┬──────────────┬───────────────────┬───────────┐
│ database │ schema │ name │ column_names │ column_types │ temporary │
│ varchar │ varchar │ varchar │ varchar[] │ varchar[] │ boolean │
├──────────┼──────────────────┼──────────┼──────────────┼───────────────────┼───────────┤
│ bq │ quacking_dataset │ duck_tbl │ [i, s] │ [BIGINT, VARCHAR] │ false │
│ bq       │ barking_dataset  │ dog_tbl  │ [i, s]       │ [BIGINT, VARCHAR] │ false     │
└──────────┴──────────────────┴──────────┴──────────────┴───────────────────┴───────────┘
-- Select data from a specific table in BigQuery
D SELECT * FROM bq.quacking_dataset.duck_tbl;
┌───────┬────────────────┐
│ i │ s │
│ int64 │ varchar        │
├───────┼────────────────┤
│ 12 │ quack 🦆 │
│ 13 │ quack quack 🦆 │
└───────┴────────────────┘
About bigquery
This community-maintained extension allows DuckDB to connect to Google BigQuery using the BigQuery Storage (read/write) and REST APIs. It enables users to read, write, and manage their BigQuery datasets/tables directly from DuckDB using standard SQL queries. For detailed setup and usage instructions, visit the extension repository.
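Because the extension supports writes as well as reads, an attached project can be modified with ordinary DuckDB DDL/DML, provided the database was attached without `READ_ONLY`. A minimal sketch (the table name `new_duck_tbl` is illustrative; valid Google Cloud credentials and an existing dataset are assumed):

```sql
-- Attach without READ_ONLY so that writes are allowed
ATTACH 'project=my_gcp_project' AS bq (TYPE bigquery);

-- Create a table in an existing dataset and insert a row
CREATE TABLE bq.quacking_dataset.new_duck_tbl (i BIGINT, s VARCHAR);
INSERT INTO bq.quacking_dataset.new_duck_tbl VALUES (1, 'quack');
```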
Added Functions
| function_name | function_type | description | comment | examples |
|---|---|---|---|---|
| bigquery_attach | table | Attach to a BigQuery project. | NULL | [ATTACH 'project=my_gcp_project' AS bq (TYPE bigquery);] |
| bigquery_scan | table | Scan a single table directly from BigQuery. | NULL | [SELECT * FROM bigquery_scan('my_gcp_project.quacking_dataset.duck_tbl');] |
| bigquery_query | table | Run a custom GoogleSQL query in BigQuery and read the results. | NULL | [SELECT * FROM bigquery_query('bq', 'SELECT * FROM quacking_dataset.duck_tbl WHERE duck_id = 123');] |
| bigquery_execute | table | Execute an arbitrary GoogleSQL query in BigQuery. | NULL | [CALL bigquery_execute('bq', 'CREATE SCHEMA deluxe_dataset OPTIONS(location="us", default_table_expiration_days=3.75);')] |
| bigquery_jobs | table | List jobs in a BigQuery project. | NULL | [SELECT * FROM bigquery_jobs('bq');] |
| bigquery_clear_cache | table | Clear the internal caches to refetch the most current project information from BigQuery. | NULL | [CALL bigquery_clear_cache();] |
| bigquery_load | table | NULL | NULL | NULL |
| bigquery_normalize_geography | scalar | NULL | NULL | NULL |
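The table functions above can be combined in a session. For example, after running a DDL statement in BigQuery with bigquery_execute, the extension's internal caches can be cleared so the new objects become visible to subsequent queries. A hedged sketch, reusing the attached alias `bq` from the example above (the dataset name `new_dataset` is illustrative):

```sql
-- Run a DDL statement directly in BigQuery (GoogleSQL syntax)
CALL bigquery_execute('bq', 'CREATE SCHEMA new_dataset OPTIONS(location="us");');

-- Refresh cached project metadata so the new dataset is visible
CALL bigquery_clear_cache();
```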
Overloaded Functions
This extension does not add any function overloads.
Added Types
This extension does not add any types.
Added Settings
| name | description | input_type | scope | aliases |
|---|---|---|---|---|
| bq_arrow_compression | Compression codec for BigQuery Storage Read API. Options: UNSPECIFIED, LZ4_FRAME, ZSTD. Default is LZ4_FRAME. | VARCHAR | GLOBAL | [] |
| bq_bignumeric_as_varchar | Read BigQuery BIGNUMERIC data type as VARCHAR | BOOLEAN | GLOBAL | [] |
| bq_curl_ca_bundle_path | Path to the CA bundle for curl | VARCHAR | GLOBAL | [] |
| bq_debug_show_queries | DEBUG SETTING: print all queries sent to BigQuery to stdout | BOOLEAN | GLOBAL | [] |
| bq_default_location | Default location for BigQuery queries | VARCHAR | GLOBAL | [] |
| bq_enable_inflight_request_windowing | Whether to allow multiple BigQuery Storage Write AppendRows requests to remain in flight before waiting for acknowledgements. Usually faster, but slightly less memory efficient because more unacknowledged requests can be buffered at once. | BOOLEAN | GLOBAL | [] |
| bq_experimental_enable_sql_parser | Whether to enable BigQuery CREATE TABLE clause parsing extensions | BOOLEAN | GLOBAL | [] |
| bq_experimental_filter_pushdown | Whether to use filter pushdown (currently experimental) | BOOLEAN | GLOBAL | [] |
| bq_experimental_use_info_schema | Whether to fetch table infos from BQ information schema (currently experimental). Can be significantly faster than fetching from REST API. | BOOLEAN | GLOBAL | [] |
| bq_max_read_streams | Maximum number of read streams for BigQuery Storage Read. Set to 0 to automatically match the number of DuckDB threads. preserve_insertion_order must be false for parallelization to work. | BIGINT | GLOBAL | [] |
| bq_query_timeout_ms | Timeout for BigQuery queries in milliseconds | BIGINT | GLOBAL | [] |
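These are ordinary DuckDB settings, so they can be adjusted with SET before attaching or querying. A sketch with illustrative values, following the table above (note that bq_max_read_streams only parallelizes when DuckDB's own preserve_insertion_order setting is false):

```sql
-- Use ZSTD compression for Storage Read API transfers
SET bq_arrow_compression = 'ZSTD';

-- Allow parallel reads: disable insertion-order preservation,
-- then let the stream count match the number of DuckDB threads
SET preserve_insertion_order = false;
SET bq_max_read_streams = 0;
```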