- Installation
- Guides
- Data Import & Export
- CSV Import
- CSV Export
- Parquet Import
- Parquet Export
- Query Parquet
- HTTP Parquet Import
- S3 Parquet Import
- S3 Parquet Export
- SQLite Import
- Postgres Import
- Meta Queries
- Python
- Install
- Execute SQL
- Jupyter Notebooks
- SQL on Pandas
- Import From Pandas
- Export To Pandas
- SQL on Arrow
- Import From Arrow
- Export To Arrow
- Relational API on Pandas
- DuckDB with Ibis
- DuckDB with Polars
- DuckDB with Vaex
- SQL Editors
- Data Viewers
- Documentation
- Connect
- Data Import
- Client APIs
- Overview
- Python
- R
- Java
- Julia
- C
- Overview
- Startup
- Configure
- Query
- Data Chunks
- Values
- Types
- Prepared Statements
- Appender
- Table Functions
- Replacement Scans
- API Reference
- C++
- Node.js
- WASM
- ODBC
- CLI
- SQL
- Introduction
- Statements
- Overview
- Select
- Insert
- Delete
- Update
- Create Schema
- Create Table
- Create View
- Create Sequence
- Create Macro
- Drop
- Alter Table
- Copy
- Export
- Query Syntax
- SELECT
- FROM
- WHERE
- GROUP BY
- GROUPING SETS
- HAVING
- ORDER BY
- LIMIT
- SAMPLE
- UNNEST
- WITH
- WINDOW
- QUALIFY
- VALUES
- FILTER
- Set Operations
- Data Types
- Expressions
- Functions
- Overview
- Enum Functions
- Numeric Functions
- Text Functions
- Pattern Matching
- Date Functions
- Timestamp Functions
- Timestamp With Time Zone Functions
- Time Functions
- Interval Functions
- Date Formats
- Date Parts
- Blob Functions
- Nested Functions
- Utility Functions
- Indexes
- Aggregates
- Window Functions
- Samples
- Information Schema
- Configuration
- Pragmas
- Extensions
- Development
- DuckDB Repositories
- Testing
- Internals Overview
- Storage Versions & Format
- Execution Format
- Profiling
- Release Dates
- Building
- Benchmark Suite
- Sitemap
- Why DuckDB
- Media
- FAQ
- Code of Conduct
- Live Demo
How to create a table from a Pandas DataFrame
CREATE TABLE AS
and INSERT INTO
can be used to create a table from any query. We can then create tables or insert into existing tables by referring to referring to the Pandas DataFrame in the query.
import duckdb
import pandas
# connect to an in-memory database
con = duckdb.connect()
my_df = pandas.DataFrame.from_dict({'a': [42]})
# create the table "my_table" from the DataFrame "my_df"
con.execute("CREATE TABLE my_table AS SELECT * FROM my_df")
# insert into the table "my_table" from the DataFrame "my_df"
con.execute("INSERT INTO my_table SELECT * FROM my_df")
‘object’ columns
pandas.DataFrame columns of an object
dtype require some special care, since this stores values of arbitrary type.
To convert these columns to DuckDB, we first go through an analyze phase before converting the values.
In this analyze phase a sample of all the rows of the column are analyzed to determine the target type.
This sample size is by default set to 1000.
If the type picked during the analyze step is wrong, this will result in a “Failed to cast value:” error, in which case you will need to increase the sample size.
The sample size can be changed by setting the pandas_analyze_sample
config option.
# example setting the sample size to 100000
duckdb.default_connection.execute("SET GLOBAL pandas_analyze_sample=100000")