- Installation
- Guides
- Data Import & Export
- CSV Import
- CSV Export
- Parquet Import
- Parquet Export
- Query Parquet
- HTTP Parquet Import
- S3 Parquet Import
- S3 Parquet Export
- SQLite Import
- Postgres Import
- Meta Queries
- Python
- Install
- Execute SQL
- Jupyter Notebooks
- SQL on Pandas
- Import From Pandas
- Export To Pandas
- SQL on Arrow
- Import From Arrow
- Export To Arrow
- Relational API on Pandas
- Multiple Python Threads
- DuckDB with Ibis
- DuckDB with Fugue
- DuckDB with Polars
- DuckDB with Vaex
- DuckDB with DataFusion
- DuckDB with fsspec filesystems
- SQL Editors
- Data Viewers
- Documentation
- Connect
- Data Import
- Client APIs
- Overview
- Python
- R
- Java
- Julia
- C
- Overview
- Startup
- Configure
- Query
- Data Chunks
- Values
- Types
- Prepared Statements
- Appender
- Table Functions
- Replacement Scans
- API Reference
- C++
- Node.js
- Wasm
- ODBC
- CLI
- SQL
- Introduction
- Statements
- Overview
- Select
- Insert
- Delete
- Update
- Create Schema
- Create Table
- Create View
- Create Sequence
- Create Macro
- Drop
- Alter Table
- Copy
- Export
- Attach
- Query Syntax
- SELECT
- FROM
- WHERE
- GROUP BY
- GROUPING SETS
- HAVING
- ORDER BY
- LIMIT
- SAMPLE
- UNNEST
- WITH
- WINDOW
- QUALIFY
- VALUES
- FILTER
- Set Operations
- Data Types
- Overview
- NULL Values
- Boolean
- Enum
- Numeric
- Text
- Date
- Timestamp
- Interval
- Blob
- Bitstring
- List
- Struct
- Map
- Union
- Expressions
- Functions
- Overview
- Enum Functions
- Numeric Functions
- Text Functions
- Pattern Matching
- Date Functions
- Timestamp Functions
- Timestamp With Time Zone Functions
- Time Functions
- Interval Functions
- Date Formats
- Date Parts
- Blob Functions
- Bitstring Functions
- Nested Functions
- Utility Functions
- Indexes
- Aggregates
- Window Functions
- Samples
- Information Schema
- Metadata Functions
- Configuration
- Pragmas
- Extensions
- Development
- Testing
- Internals Overview
- Storage Versions & Format
- Execution Format
- Profiling
- Release Dates
- Building
- Sitemap
- Why DuckDB
- Media
- FAQ
- Code of Conduct
- Live Demo
DuckDB’s Python client provides multiple additional methods that can be used to efficiently retrieve data.
NumPy
fetchnumpy()
fetches the data as a dictionary of NumPy arrays
Pandas
df()
fetches the data as a Pandas DataFramefetchdf()
is an alias ofdf()
fetch_df()
is an alias ofdf()
fetch_df_chunk(vector_multiple)
fetches a portion of the results into a DataFrame. The number of rows returned in each chunk is the vector size (2048 by default) * vector_multiple (1 by default).
Apache Arrow
arrow()
fetches the data as an Arrow tablefetch_arrow_table()
is an alias ofarrow()
fetch_record_batch(chunk_size)
returns an Arrow record batch reader withchunk_size
rows per batch
Polars
pl()
fetches the data as a Polars DataFrame
Below are some examples using this functionality. See the Python guides for more examples.
# fetch as Pandas DataFrame
df = con.execute("SELECT * FROM items").fetchdf()
print(df)
# item value count
# 0 jeans 20.0 1
# 1 hammer 42.2 2
# 2 laptop 2000.0 1
# 3 chainsaw 500.0 10
# 4 iphone 300.0 2
# fetch as dictionary of numpy arrays
arr = con.execute("SELECT * FROM items").fetchnumpy()
print(arr)
# {'item': masked_array(data=['jeans', 'hammer', 'laptop', 'chainsaw', 'iphone'],
# mask=[False, False, False, False, False],
# fill_value='?',
# dtype=object), 'value': masked_array(data=[20.0, 42.2, 2000.0, 500.0, 300.0],
# mask=[False, False, False, False, False],
# fill_value=1e+20), 'count': masked_array(data=[1, 2, 1, 10, 2],
# mask=[False, False, False, False, False],
# fill_value=999999,
# dtype=int32)}
# fetch as an Arrow table. Converting to Pandas afterwards just for pretty printing
tbl = con.execute("SELECT * FROM items").fetch_arrow_table()
print(tbl.to_pandas())
# item value count
# 0 jeans 20.00 1
# 1 hammer 42.20 2
# 2 laptop 2000.00 1
# 3 chainsaw 500.00 10
# 4 iphone 300.00 2
Search Shortcut cmd + k | ctrl + k