Search Shortcut cmd + k | ctrl + k
bigquery

Integrates DuckDB with Google BigQuery, allowing direct querying and management of BigQuery datasets

Maintainer(s): hafenkran

Installing and Loading

INSTALL bigquery FROM community;
LOAD bigquery;

Example

-- Attach to your BigQuery Project
D ATTACH 'project=my_gcp_project' AS bq (TYPE bigquery, READ_ONLY);

-- Show all tables in all datasets in the attached BigQuery project
D SHOW ALL TABLES;
┌──────────┬──────────────────┬──────────┬──────────────┬───────────────────┬───────────┐
 database       schema         name    column_names    column_types     temporary 
 varchar       varchar        varchar   varchar[]        varchar[]       boolean  
├──────────┼──────────────────┼──────────┼──────────────┼───────────────────┼───────────┤
 bq        quacking_dataset  duck_tbl  [i, s]        [BIGINT, VARCHAR]  false     
| bq       | barking_dataset  | dog_tbl  | [i, s]       | [BIGINT, VARCHAR]  false     
└──────────┴──────────────────┴──────────┴──────────────┴───────────────────┴───────────┘

-- Select data from a specific table in BigQuery
D SELECT * FROM bq.quacking_dataset.duck_tbl;
┌───────┬────────────────┐
   i          s        
 int32     varchar     
├───────┼────────────────┤
    12  quack 🦆       
    13  quack quack 🦆 
└───────┴────────────────┘

About bigquery

This community-maintained extension allows DuckDB to connect to Google BigQuery using the BigQuery Storage (read/write) and REST APIs. It enables users to read, write, and manage their BigQuery datasets/tables directly from DuckDB using standard SQL queries. For detailed setup and usage instructions, visit the extension repository.

Added Functions

function_name function_type description comment examples
bigquery_attach table Attach to a BigQuery project. NULL [ATTACH 'project=my_gcp_project' as bq (TYPE bigquery);]
bigquery_scan table Scan a single table directly from BigQuery. NULL [SELECT * FROM bigquery_scan('my_gcp_project.quacking_dataset.duck_tbl');]
bigquery_query table Run a custom GoogleSQL query in BigQuery and read the results. NULL [SELECT * FROM bigquery_query('bq', 'SELECT * FROM quacking_dataset.duck_tbl WHERE duck_id = 123');]
bigquery_execute table Execute an arbitrary GoogleSQL query in BigQuery. NULL [CALL bigquery_execute('bq', 'CREATE SCHEMA deluxe_dataset OPTIONS(location="us", default_table_expiration_days=3.75);')]
bigquery_jobs table List jobs in a BigQuery project. NULL [SELECT * FROM bigquery_jobs('bq');]
bigquery_clear_cache table Clear the internal caches to refetch the most current project information from BigQuery. NULL [CALL bigquery_clear_cache();]
bigquery_load table NULL NULL NULL
bigquery_normalize_geography scalar NULL NULL NULL

Overloaded Functions

This extension does not add any function overloads.

Added Types

This extension does not add any types.

Added Settings

name description input_type scope aliases
bq_arrow_compression Compression codec for BigQuery Storage Read API. Options: UNSPECIFIED, LZ4_FRAME, ZSTD.Default is LZ4_FRAME. VARCHAR GLOBAL []
bq_bignumeric_as_varchar Read BigQuery BIGNUMERIC data type as VARCHAR BOOLEAN GLOBAL []
bq_curl_ca_bundle_path Path to the CA bundle for curl VARCHAR GLOBAL []
bq_debug_show_queries DEBUG SETTING: print all queries sent to BigQuery to stdout BOOLEAN GLOBAL []
bq_default_location Default location for BigQuery queries VARCHAR GLOBAL []
bq_enable_inflight_request_windowing Whether to allow multiple BigQuery Storage Write AppendRows requests to remain in flight before waiting for acknowledgements. Usually faster, but slightly less memory efficient because more unacknowledged requests can be buffered at once. BOOLEAN GLOBAL []
bq_experimental_enable_sql_parser Whether to enable BigQuery CREATE TABLE clause parsing extensions BOOLEAN GLOBAL []
bq_experimental_filter_pushdown Whether to use filter pushdown (currently experimental) BOOLEAN GLOBAL []
bq_experimental_use_info_schema Whether to fetch table infos from BQ information schema (currently experimental). Can be significantly faster than fetching from REST API. BOOLEAN GLOBAL []
bq_max_read_streams Maximum number of read streams for BigQuery Storage Read. Set to 0 to automatically match the number of DuckDB threads. preserve_insertion_order must be false for parallelization to work. BIGINT GLOBAL []
bq_query_timeout_ms Timeout for BigQuery queries in milliseconds BIGINT GLOBAL []