- Installation
- Guides
- Data Import & Export
- CSV Import
- CSV Export
- Parquet Import
- Parquet Export
- Query Parquet
- HTTP Parquet Import
- S3 Parquet Import
- S3 Parquet Export
- SQLite Import
- Postgres Import
- Meta Queries
- Python
- Install
- Execute SQL
- Jupyter Notebooks
- SQL on Pandas
- Import From Pandas
- Export To Pandas
- SQL on Arrow
- Import From Arrow
- Export To Arrow
- Relational API on Pandas
- Multiple Python Threads
- DuckDB with Ibis
- DuckDB with Fugue
- DuckDB with Polars
- DuckDB with Vaex
- DuckDB with DataFusion
- DuckDB with fsspec filesystems
- SQL Editors
- Data Viewers
- Documentation
- Connect
- Data Import
- Client APIs
- Overview
- Python
- R
- Java
- Julia
- C
- Overview
- Startup
- Configure
- Query
- Data Chunks
- Values
- Types
- Prepared Statements
- Appender
- Table Functions
- Replacement Scans
- API Reference
- C++
- Node.js
- Wasm
- ODBC
- CLI
- SQL
- Introduction
- Statements
- Overview
- Select
- Insert
- Delete
- Update
- Create Schema
- Create Table
- Create View
- Create Sequence
- Create Macro
- Drop
- Alter Table
- Copy
- Export
- Attach
- Query Syntax
- SELECT
- FROM
- WHERE
- GROUP BY
- GROUPING SETS
- HAVING
- ORDER BY
- LIMIT
- SAMPLE
- UNNEST
- WITH
- WINDOW
- QUALIFY
- VALUES
- FILTER
- Set Operations
- Data Types
- Overview
- NULL Values
- Boolean
- Enum
- Numeric
- Text
- Date
- Timestamp
- Interval
- Blob
- Bitstring
- List
- Struct
- Map
- Union
- Expressions
- Functions
- Overview
- Enum Functions
- Numeric Functions
- Text Functions
- Pattern Matching
- Date Functions
- Timestamp Functions
- Timestamp With Time Zone Functions
- Time Functions
- Interval Functions
- Date Formats
- Date Parts
- Blob Functions
- Bitstring Functions
- Nested Functions
- Utility Functions
- Indexes
- Aggregates
- Window Functions
- Samples
- Information Schema
- Metadata Functions
- Configuration
- Pragmas
- Extensions
- Development
- Testing
- Internals Overview
- Storage Versions & Format
- Execution Format
- Profiling
- Release Dates
- Building
- Sitemap
- Why DuckDB
- Media
- FAQ
- Code of Conduct
- Live Demo
This feature is experimental, and is subject to change
DuckDB support for fsspec
filesystems allows querying data in filesystems that DuckDB’s httpfs
extension does not support. fsspec
has a large number of inbuilt filesystems, and there are also many external implementations. This capability is only available in DuckDB’s Python client because fsspec
is a Python library, while the httpfs
extension is available in many DuckDB clients.
Example
The following is an example of using fsspec
to query a file in Google Cloud Storage (instead of using their s3 inter-compatibility api).
Firstly, you must install duckdb
and fsspec
, and a filesystem interface of your choice
$ pip install duckdb fsspec gcsfs
then you can register whichever filesystem you’d like to query
import duckdb
from fsspec import filesystem
# this line will throw an exception if the appropriate filesystem interface is not installed
duckdb.register_filesystem(filesystem('gcs'))
duckdb.sql("select * from read_csv_auto('gcs:///bucket/file.csv')")
Potential issues
Please also note, that as these filesystems are not implemented in C++, their performance may not be comparable to the ones provided by the httpfs
extension.
It’s also worth noting that as they are third party libraries, they may contain bugs that are beyond our control.