Integrates DuckDB with Google BigQuery, allowing direct querying and management of BigQuery datasets
Maintainer(s):
hafenkran
Installing and Loading
INSTALL bigquery FROM community;
LOAD bigquery;
Example
-- Attach to your BigQuery Project
D ATTACH 'project=my_gcp_project' AS bq (TYPE bigquery, READ_ONLY);
-- Show all tables in all datasets in the attached BigQuery project
D SHOW ALL TABLES;
┌──────────┬──────────────────┬──────────┬──────────────┬───────────────────┬───────────┐
│ database │ schema │ name │ column_names │ column_types │ temporary │
│ varchar │ varchar │ varchar │ varchar[] │ varchar[] │ boolean │
├──────────┼──────────────────┼──────────┼──────────────┼───────────────────┼───────────┤
│ bq │ quacking_dataset │ duck_tbl │ [i, s] │ [BIGINT, VARCHAR] │ false │
│ bq       │ barking_dataset  │ dog_tbl  │ [i, s]       │ [BIGINT, VARCHAR] │ false     │
└──────────┴──────────────────┴──────────┴──────────────┴───────────────────┴───────────┘
-- Select data from a specific table in BigQuery
D SELECT * FROM bq.quacking_dataset.duck_tbl;
┌───────┬────────────────┐
│ i │ s │
│ int32 │ varchar │
├───────┼────────────────┤
│ 12 │ quack 🦆 │
│ 13 │ quack quack 🦆 │
└───────┴────────────────┘
About bigquery
This extension allows DuckDB to connect to Google BigQuery using the BigQuery Storage (read/write) and REST APIs. It enables users to read, write, and manage their BigQuery datasets/tables directly from DuckDB using standard SQL queries. For detailed setup and usage instructions, visit the extension repository.
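Beyond reads, the attached catalog can be written to. The following is a minimal sketch of the write path, assuming write access to the project and reusing the dataset from the example above; the table name new_duck_tbl is a placeholder chosen for illustration.
-- Attach without READ_ONLY so that writes are allowed
ATTACH 'project=my_gcp_project' AS bq (TYPE bigquery);
-- Create a table in an existing dataset and insert a row
CREATE TABLE bq.quacking_dataset.new_duck_tbl (i BIGINT, s VARCHAR);
INSERT INTO bq.quacking_dataset.new_duck_tbl VALUES (14, 'quack?');
-- Read it back through the attached catalog
SELECT * FROM bq.quacking_dataset.new_duck_tbl;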
Added Functions
function_name | function_type | description | comment | examples |
---|---|---|---|---|
bigquery_attach | table | Attach to a BigQuery project. | NULL | [ATTACH 'project=my_gcp_project' as bq (TYPE bigquery);] |
bigquery_scan | table | Scan a single table directly from BigQuery. | NULL | [SELECT * FROM bigquery_scan('my_gcp_project.quacking_dataset.duck_tbl');] |
bigquery_query | table | Run a custom GoogleSQL query in BigQuery and read the results. | NULL | [SELECT * FROM bigquery_query('bq', 'SELECT * FROM quacking_dataset.duck_tbl WHERE duck_id = 123');] |
bigquery_execute | table | Execute an arbitrary GoogleSQL query in BigQuery. | NULL | [CALL bigquery_execute('bq', 'CREATE SCHEMA deluxe_dataset OPTIONS(location="us", default_table_expiration_days=3.75);')] |
bigquery_jobs | table | List jobs in a BigQuery project. | NULL | [SELECT * FROM bigquery_jobs('bq');] |
bigquery_clear_cache | table | Clear the internal caches to refetch the most current project information from BigQuery. | NULL | [CALL bigquery_clear_cache();] |
bigquery_arrow_scan | table | NULL | NULL | NULL |
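As a sketch of how the two main table functions differ (reusing the placeholder project, dataset, and table names from the example above): bigquery_scan reads the table contents and leaves any filtering or aggregation to DuckDB, while bigquery_query sends the full GoogleSQL statement to BigQuery and only streams the result back.
-- Read the table rows and let DuckDB compute the aggregate
SELECT count(*) AS n, max(i) AS max_i
FROM bigquery_scan('my_gcp_project.quacking_dataset.duck_tbl');
-- Run the equivalent GoogleSQL inside BigQuery via the attached alias 'bq'
SELECT * FROM bigquery_query('bq',
    'SELECT COUNT(*) AS n, MAX(i) AS max_i FROM quacking_dataset.duck_tbl');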
Added Settings
name | description | input_type | scope |
---|---|---|---|
bq_arrow_compression | Compression codec for the BigQuery Storage Read API. Options: UNSPECIFIED, LZ4_FRAME, ZSTD. Default is LZ4_FRAME. | VARCHAR | GLOBAL |
bq_bignumeric_as_varchar | Read BigQuery BIGNUMERIC data type as VARCHAR | BOOLEAN | GLOBAL |
bq_curl_ca_bundle_path | Path to the CA bundle for curl | VARCHAR | GLOBAL |
bq_debug_show_queries | DEBUG SETTING: print all queries sent to BigQuery to stdout | BOOLEAN | GLOBAL |
bq_default_location | Default location for BigQuery queries | VARCHAR | GLOBAL |
bq_experimental_enable_bigquery_options | Whether to enable BigQuery OPTIONS in CREATE statements | BOOLEAN | GLOBAL |
bq_experimental_filter_pushdown | Whether to use filter pushdown (currently experimental) | BOOLEAN | GLOBAL |
bq_experimental_use_incubating_scan | Whether to use the incubating BigQuery scan implementation. This is currently experimental and is targeted to become the default in the future. DEPRECATED: Use bq_use_legacy_scan instead. This setting will be removed in a future version. | BOOLEAN | GLOBAL |
bq_experimental_use_info_schema | Whether to fetch table infos from BQ information schema (currently experimental). Can be significantly faster than fetching from REST API. | BOOLEAN | GLOBAL |
bq_max_read_streams | Maximum number of read streams for BigQuery Storage Read. Set to 0 to automatically match the number of DuckDB threads. preserve_insertion_order must be false for parallelization to work. | BIGINT | GLOBAL |
bq_query_timeout_ms | Timeout for BigQuery queries in milliseconds | BIGINT | GLOBAL |
bq_use_legacy_scan | Whether to use legacy scan implementation for BigQuery tables. Default is false (uses optimized Arrow-based implementation). | BOOLEAN | GLOBAL |
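These settings are applied with DuckDB's standard SET statement. A short sketch of a session configuration follows; the values are illustrative, not recommendations.
-- Enable experimental filter pushdown and parallel reads
SET bq_experimental_filter_pushdown = true;
SET bq_max_read_streams = 0;            -- 0 = match the number of DuckDB threads
SET preserve_insertion_order = false;   -- required for parallel reads (see bq_max_read_streams above)
-- Switch the Storage Read API compression codec to ZSTD
SET bq_arrow_compression = 'ZSTD';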