0.10 (stable)
Using fsspec Filesystems

DuckDB’s support for fsspec filesystems allows querying data in filesystems that DuckDB’s httpfs extension does not support. fsspec ships with a large number of built-in filesystems, and many external implementations are also available. Because fsspec is a Python library, this capability is only available in DuckDB’s Python client, whereas the httpfs extension is available in many DuckDB clients.

Example

The following example uses fsspec to query a file in Google Cloud Storage directly, rather than going through Google's S3-compatible API.

First, install duckdb, fsspec, and a filesystem interface of your choice (gcsfs in this example).

pip install duckdb fsspec gcsfs

Then, you can register whichever filesystem you’d like to query:

import duckdb
from fsspec import filesystem

# This line raises an exception if the matching filesystem interface (here, gcsfs) is not installed
duckdb.register_filesystem(filesystem('gcs'))

duckdb.sql("SELECT * FROM read_csv('gcs:///bucket/file.csv')")

These filesystems are not implemented in C++, so their performance may not be comparable to that of the filesystems provided by the httpfs extension. Note also that, because they are third-party libraries, they may contain bugs that are beyond our control.

About this page

Last modified: 2024-04-15