Search Shortcut cmd + k | ctrl + k
Search cmd+k ctrl+k
1.0 (stable)
Conversion between DuckDB and Python

This page documents the rules for converting Python objects to DuckDB and DuckDB results to Python.

Object Conversion: Python Object to DuckDB

This is a mapping of Python object types to DuckDB Logical Types:

  • NoneNULL
  • boolBOOLEAN
  • datetime.timedeltaINTERVAL
  • strVARCHAR
  • bytearrayBLOB
  • memoryviewBLOB
  • decimal.DecimalDECIMAL / DOUBLE
  • uuid.UUIDUUID

The rest of the conversion rules are as follows.


Since integers can be of arbitrary size in Python, there is not a one-to-one conversion possible for ints. Intead we perform these casts in order until one succeeds:


When using the DuckDB Value class, it's possible to set a target type, which will influence the conversion.


These casts are tried in order until one succeeds:



For datetime we will check pandas.isnull if it's available and return NULL if it returns true. We check against datetime.datetime.min and datetime.datetime.max to convert to -inf and +inf respectively.

If the datetime has tzinfo, we will use TIMESTAMPTZ, otherwise it becomes TIMESTAMP.


If the time has tzinfo, we will use TIMETZ, otherwise it becomes TIME.

date converts to the DATE type. We check against and to convert to -inf and +inf respectively.


bytes converts to BLOB by default, when it's used to construct a Value object of type BITSTRING, it maps to BITSTRING instead.


list becomes a LIST type of the "most permissive" type of its children, for example:

my_list_value = [

Will become VARCHAR[] because 12345 can convert to VARCHAR but test can not convert to INTEGER.

[12345, test]


The dict object can convert to either STRUCT(...) or MAP(..., ...) depending on its structure. If the dict has a structure similar to:

my_map_dict = {
    "key": [
        1, 2, 3
    "value": [
        "one", "two", "three"

Then we'll convert it to a MAP of key-value pairs of the two lists zipped together. The example above becomes a MAP(INTEGER, VARCHAR):

{1=one, 2=two, 3=three}

The names of the fields matter and the two lists need to have the same size.

Otherwise we'll try to convert it to a STRUCT.

my_struct_dict = {
    1: "one",
    "2": 2,
    "three": [1, 2, 3],
    False: True


{'1': one, '2': 2, 'three': [1, 2, 3], 'False': true}

Every key of the dictionary is converted to string.


tuple converts to LIST by default, when it's used to construct a Value object of type STRUCT it will convert to STRUCT instead.

numpy.ndarray and numpy.datetime64

ndarray and datetime64 are converted by calling tolist() and converting the result of that.

Result Conversion: DuckDB Results to Python

DuckDB's Python client provides multiple additional methods that can be used to efficiently retrieve data.


  • fetchnumpy() fetches the data as a dictionary of NumPy arrays


  • df() fetches the data as a Pandas DataFrame
  • fetchdf() is an alias of df()
  • fetch_df() is an alias of df()
  • fetch_df_chunk(vector_multiple) fetches a portion of the results into a DataFrame. The number of rows returned in each chunk is the vector size (2048 by default) * vector_multiple (1 by default).

Apache Arrow

  • arrow() fetches the data as an Arrow table
  • fetch_arrow_table() is an alias of arrow()
  • fetch_record_batch(chunk_size) returns an Arrow record batch reader with chunk_size rows per batch


  • pl() fetches the data as a Polars DataFrame


Below are some examples using this functionality. See the Python guides for more examples.

Fetch as Pandas DataFrame:

df = con.execute("SELECT * FROM items").fetchdf()
       item   value  count
0     jeans    20.0      1
1    hammer    42.2      2
2    laptop  2000.0      1
3  chainsaw   500.0     10
4    iphone   300.0      2

Fetch as dictionary of NumPy arrays:

arr = con.execute("SELECT * FROM items").fetchnumpy()
{'item': masked_array(data=['jeans', 'hammer', 'laptop', 'chainsaw', 'iphone'],
             mask=[False, False, False, False, False],
            dtype=object), 'value': masked_array(data=[20.0, 42.2, 2000.0, 500.0, 300.0],
             mask=[False, False, False, False, False],
       fill_value=1e+20), 'count': masked_array(data=[1, 2, 1, 10, 2],
             mask=[False, False, False, False, False],

Fetch as an Arrow table. Converting to Pandas afterwards just for pretty printing:

tbl = con.execute("SELECT * FROM items").fetch_arrow_table()
       item    value  count
0     jeans    20.00      1
1    hammer    42.20      2
2    laptop  2000.00      1
3  chainsaw   500.00     10
4    iphone   300.00      2
About this page

Last modified: 2024-07-10