- duckdb.threadsafety bool¶
-
Indicates that this package is threadsafe
- duckdb.apilevel int¶
-
Indicates which Python DBAPI version this package implements
- duckdb.paramstyle str¶
-
Indicates which parameter style duckdb supports
- duckdb.default_connection duckdb.DuckDBPyConnection¶
-
The connection that is used by default if you don’t explicitly pass one to the root methods in this module
- exception duckdb.BinderException¶
-
Bases:
ProgrammingError
- duckdb.CaseExpression(condition: duckdb.duckdb.Expression, value: duckdb.duckdb.Expression) duckdb.duckdb.Expression ¶
- exception duckdb.CatalogException¶
-
Bases:
ProgrammingError
- duckdb.CoalesceOperator(*args) duckdb.duckdb.Expression ¶
- duckdb.ColumnExpression(name: str) duckdb.duckdb.Expression ¶
-
Create a column reference from the provided column name
- exception duckdb.ConnectionException¶
-
Bases:
OperationalError
- duckdb.ConstantExpression(value: object) duckdb.duckdb.Expression ¶
-
Create a constant expression from the provided value
- exception duckdb.ConstraintException¶
-
Bases:
IntegrityError
- exception duckdb.DataError¶
-
Bases:
DatabaseError
- class duckdb.DuckDBPyConnection¶
-
Bases:
pybind11_object
- append(self: duckdb.duckdb.DuckDBPyConnection, table_name: str, df: pandas.DataFrame, *, by_name: bool = False) duckdb.duckdb.DuckDBPyConnection ¶
-
Append the passed DataFrame to the named table
- array_type(self: duckdb.duckdb.DuckDBPyConnection, type: duckdb.duckdb.typing.DuckDBPyType, size: int) duckdb.duckdb.typing.DuckDBPyType ¶
-
Create an array type object of ‘type’
- arrow(self: duckdb.duckdb.DuckDBPyConnection, rows_per_batch: int = 1000000) pyarrow.lib.Table ¶
-
Fetch a result as an Arrow table following execute()
- begin(self: duckdb.duckdb.DuckDBPyConnection) duckdb.duckdb.DuckDBPyConnection ¶
-
Start a new transaction
- checkpoint(self: duckdb.duckdb.DuckDBPyConnection) duckdb.duckdb.DuckDBPyConnection ¶
-
Synchronizes data in the write-ahead log (WAL) to the database data file (no-op for in-memory connections)
- close(self: duckdb.duckdb.DuckDBPyConnection) None ¶
-
Close the connection
- commit(self: duckdb.duckdb.DuckDBPyConnection) duckdb.duckdb.DuckDBPyConnection ¶
-
Commit changes performed within a transaction
- create_function(self: duckdb.duckdb.DuckDBPyConnection, name: str, function: Callable, parameters: object = None, return_type: duckdb.duckdb.typing.DuckDBPyType = None, *, type: duckdb.duckdb.functional.PythonUDFType = <PythonUDFType.NATIVE: 0>, null_handling: duckdb.duckdb.functional.FunctionNullHandling = <FunctionNullHandling.DEFAULT: 0>, exception_handling: duckdb.duckdb.PythonExceptionHandling = <PythonExceptionHandling.DEFAULT: 0>, side_effects: bool = False) duckdb.duckdb.DuckDBPyConnection ¶
-
Create a DuckDB function from the passed-in Python function so it can be used in queries
- cursor(self: duckdb.duckdb.DuckDBPyConnection) duckdb.duckdb.DuckDBPyConnection ¶
-
Create a duplicate of the current connection
- decimal_type(self: duckdb.duckdb.DuckDBPyConnection, width: int, scale: int) duckdb.duckdb.typing.DuckDBPyType ¶
-
Create a decimal type with ‘width’ and ‘scale’
- property description¶
-
Get result set attributes, mainly column names
- df(self: duckdb.duckdb.DuckDBPyConnection, *, date_as_object: bool = False) pandas.DataFrame ¶
-
Fetch a result as a DataFrame following execute()
- dtype(self: duckdb.duckdb.DuckDBPyConnection, type_str: str) duckdb.duckdb.typing.DuckDBPyType ¶
-
Create a type object by parsing the ‘type_str’ string
- duplicate(self: duckdb.duckdb.DuckDBPyConnection) duckdb.duckdb.DuckDBPyConnection ¶
-
Create a duplicate of the current connection
- enum_type(self: duckdb.duckdb.DuckDBPyConnection, name: str, type: duckdb.duckdb.typing.DuckDBPyType, values: list) duckdb.duckdb.typing.DuckDBPyType ¶
-
Create an enum type of underlying ‘type’, consisting of the list of ‘values’
- execute(self: duckdb.duckdb.DuckDBPyConnection, query: object, parameters: object = None) duckdb.duckdb.DuckDBPyConnection ¶
-
Execute the given SQL query, optionally using prepared statements with parameters set
- executemany(self: duckdb.duckdb.DuckDBPyConnection, query: object, parameters: object = None) duckdb.duckdb.DuckDBPyConnection ¶
-
Execute the given prepared statement multiple times using the list of parameter sets in parameters
- extract_statements(self: duckdb.duckdb.DuckDBPyConnection, query: str) list ¶
-
Parse the query string and extract the Statement object(s) produced
- fetch_arrow_table(self: duckdb.duckdb.DuckDBPyConnection, rows_per_batch: int = 1000000) pyarrow.lib.Table ¶
-
Fetch a result as an Arrow table following execute()
- fetch_df(self: duckdb.duckdb.DuckDBPyConnection, *, date_as_object: bool = False) pandas.DataFrame ¶
-
Fetch a result as a DataFrame following execute()
- fetch_df_chunk(self: duckdb.duckdb.DuckDBPyConnection, vectors_per_chunk: int = 1, *, date_as_object: bool = False) pandas.DataFrame ¶
-
Fetch a chunk of the result as a DataFrame following execute()
- fetch_record_batch(self: duckdb.duckdb.DuckDBPyConnection, rows_per_batch: int = 1000000) pyarrow.lib.RecordBatchReader ¶
-
Fetch an Arrow RecordBatchReader following execute()
- fetchall(self: duckdb.duckdb.DuckDBPyConnection) list ¶
-
Fetch all rows from a result following execute()
- fetchdf(self: duckdb.duckdb.DuckDBPyConnection, *, date_as_object: bool = False) pandas.DataFrame ¶
-
Fetch a result as a DataFrame following execute()
- fetchmany(self: duckdb.duckdb.DuckDBPyConnection, size: int = 1) list ¶
-
Fetch the next set of rows from a result following execute()
- fetchnumpy(self: duckdb.duckdb.DuckDBPyConnection) dict ¶
-
Fetch a result as a dict of NumPy arrays following execute()
- fetchone(self: duckdb.duckdb.DuckDBPyConnection) Optional[tuple] ¶
-
Fetch a single row from a result following execute()
- filesystem_is_registered(self: duckdb.duckdb.DuckDBPyConnection, name: str) bool ¶
-
Check if a filesystem with the provided name is currently registered
- from_arrow(self: duckdb.duckdb.DuckDBPyConnection, arrow_object: object) duckdb.duckdb.DuckDBPyRelation ¶
-
Create a relation object from an Arrow object
- from_csv_auto(self: duckdb.duckdb.DuckDBPyConnection, path_or_buffer: object, **kwargs) duckdb.duckdb.DuckDBPyRelation ¶
-
Create a relation object from the CSV file in ‘path_or_buffer’
- from_df(self: duckdb.duckdb.DuckDBPyConnection, df: pandas.DataFrame) duckdb.duckdb.DuckDBPyRelation ¶
-
Create a relation object from the DataFrame in df
- from_parquet(*args, **kwargs)¶
-
Overloaded function.
from_parquet(self: duckdb.duckdb.DuckDBPyConnection, file_glob: str, binary_as_string: bool = False, *, file_row_number: bool = False, filename: bool = False, hive_partitioning: bool = False, union_by_name: bool = False, compression: object = None) -> duckdb.duckdb.DuckDBPyRelation
Create a relation object from the Parquet files in file_glob
from_parquet(self: duckdb.duckdb.DuckDBPyConnection, file_globs: list[str], binary_as_string: bool = False, *, file_row_number: bool = False, filename: bool = False, hive_partitioning: bool = False, union_by_name: bool = False, compression: object = None) -> duckdb.duckdb.DuckDBPyRelation
Create a relation object from the Parquet files in file_globs
- from_query(self: duckdb.duckdb.DuckDBPyConnection, query: object, *, alias: str = '', params: object = None) duckdb.duckdb.DuckDBPyRelation ¶
-
Run a SQL query. If it is a SELECT statement, create a relation object from the given SQL query, otherwise run the query as-is.
- from_substrait(self: duckdb.duckdb.DuckDBPyConnection, proto: bytes) duckdb.duckdb.DuckDBPyRelation ¶
-
Create a query object from a protobuf plan
- from_substrait_json(self: duckdb.duckdb.DuckDBPyConnection, json: str) duckdb.duckdb.DuckDBPyRelation ¶
-
Create a query object from a JSON protobuf plan
- get_substrait(self: duckdb.duckdb.DuckDBPyConnection, query: str, *, enable_optimizer: bool = True) duckdb.duckdb.DuckDBPyRelation ¶
-
Serialize a query to protobuf
- get_substrait_json(self: duckdb.duckdb.DuckDBPyConnection, query: str, *, enable_optimizer: bool = True) duckdb.duckdb.DuckDBPyRelation ¶
-
Serialize a query to protobuf in JSON format
- get_table_names(self: duckdb.duckdb.DuckDBPyConnection, query: str) set[str] ¶
-
Extract the required table names from a query
- install_extension(self: duckdb.duckdb.DuckDBPyConnection, extension: str, *, force_install: bool = False, repository: object = None, repository_url: object = None, version: object = None) None ¶
-
Install an extension by name, with an optional version and/or repository to get the extension from
- interrupt(self: duckdb.duckdb.DuckDBPyConnection) None ¶
-
Interrupt pending operations
- list_filesystems(self: duckdb.duckdb.DuckDBPyConnection) list ¶
-
List registered filesystems, including builtin ones
- list_type(self: duckdb.duckdb.DuckDBPyConnection, type: duckdb.duckdb.typing.DuckDBPyType) duckdb.duckdb.typing.DuckDBPyType ¶
-
Create a list type object of ‘type’
- load_extension(self: duckdb.duckdb.DuckDBPyConnection, extension: str) None ¶
-
Load an installed extension
- map_type(self: duckdb.duckdb.DuckDBPyConnection, key: duckdb.duckdb.typing.DuckDBPyType, value: duckdb.duckdb.typing.DuckDBPyType) duckdb.duckdb.typing.DuckDBPyType ¶
-
Create a map type object from ‘key’ and ‘value’
- pl(self: duckdb.duckdb.DuckDBPyConnection, rows_per_batch: int = 1000000) polars.DataFrame ¶
-
Fetch a result as a Polars DataFrame following execute()
- query(self: duckdb.duckdb.DuckDBPyConnection, query: object, *, alias: str = '', params: object = None) duckdb.duckdb.DuckDBPyRelation ¶
-
Run a SQL query. If it is a SELECT statement, create a relation object from the given SQL query, otherwise run the query as-is.
- read_csv(self: duckdb.duckdb.DuckDBPyConnection, path_or_buffer: object, **kwargs) duckdb.duckdb.DuckDBPyRelation ¶
-
Create a relation object from the CSV file in ‘path_or_buffer’
- read_json(self: duckdb.duckdb.DuckDBPyConnection, path_or_buffer: object, *, columns: Optional[object] = None, sample_size: Optional[object] = None, maximum_depth: Optional[object] = None, records: Optional[str] = None, format: Optional[str] = None, date_format: Optional[object] = None, timestamp_format: Optional[object] = None, compression: Optional[object] = None, maximum_object_size: Optional[object] = None, ignore_errors: Optional[object] = None, convert_strings_to_integers: Optional[object] = None, field_appearance_threshold: Optional[object] = None, map_inference_threshold: Optional[object] = None, maximum_sample_files: Optional[object] = None, filename: Optional[object] = None, hive_partitioning: Optional[object] = None, union_by_name: Optional[object] = None, hive_types: Optional[object] = None, hive_types_autocast: Optional[object] = None) duckdb.duckdb.DuckDBPyRelation ¶
-
Create a relation object from the JSON file in ‘path_or_buffer’
- read_parquet(*args, **kwargs)¶
-
Overloaded function.
read_parquet(self: duckdb.duckdb.DuckDBPyConnection, file_glob: str, binary_as_string: bool = False, *, file_row_number: bool = False, filename: bool = False, hive_partitioning: bool = False, union_by_name: bool = False, compression: object = None) -> duckdb.duckdb.DuckDBPyRelation
Create a relation object from the Parquet files in file_glob
read_parquet(self: duckdb.duckdb.DuckDBPyConnection, file_globs: list[str], binary_as_string: bool = False, *, file_row_number: bool = False, filename: bool = False, hive_partitioning: bool = False, union_by_name: bool = False, compression: object = None) -> duckdb.duckdb.DuckDBPyRelation
Create a relation object from the Parquet files in file_globs
- register(self: duckdb.duckdb.DuckDBPyConnection, view_name: str, python_object: object) duckdb.duckdb.DuckDBPyConnection ¶
-
Register the passed Python object so that it can be queried as a view
- register_filesystem(self: duckdb.duckdb.DuckDBPyConnection, filesystem: fsspec.AbstractFileSystem) None ¶
-
Register a fsspec compliant filesystem
- remove_function(self: duckdb.duckdb.DuckDBPyConnection, name: str) duckdb.duckdb.DuckDBPyConnection ¶
-
Remove a previously created function
- rollback(self: duckdb.duckdb.DuckDBPyConnection) duckdb.duckdb.DuckDBPyConnection ¶
-
Roll back changes performed within a transaction
- row_type(self: duckdb.duckdb.DuckDBPyConnection, fields: object) duckdb.duckdb.typing.DuckDBPyType ¶
-
Create a struct type object from ‘fields’
- property rowcount¶
-
Get result set row count
- sql(self: duckdb.duckdb.DuckDBPyConnection, query: object, *, alias: str = '', params: object = None) duckdb.duckdb.DuckDBPyRelation ¶
-
Run a SQL query. If it is a SELECT statement, create a relation object from the given SQL query, otherwise run the query as-is.
- sqltype(self: duckdb.duckdb.DuckDBPyConnection, type_str: str) duckdb.duckdb.typing.DuckDBPyType ¶
-
Create a type object by parsing the ‘type_str’ string
- string_type(self: duckdb.duckdb.DuckDBPyConnection, collation: str = '') duckdb.duckdb.typing.DuckDBPyType ¶
-
Create a string type with an optional collation
- struct_type(self: duckdb.duckdb.DuckDBPyConnection, fields: object) duckdb.duckdb.typing.DuckDBPyType ¶
-
Create a struct type object from ‘fields’
- table(self: duckdb.duckdb.DuckDBPyConnection, table_name: str) duckdb.duckdb.DuckDBPyRelation ¶
-
Create a relation object for the named table
- table_function(self: duckdb.duckdb.DuckDBPyConnection, name: str, parameters: object = None) duckdb.duckdb.DuckDBPyRelation ¶
-
Create a relation object from the named table function with given parameters
- tf(self: duckdb.duckdb.DuckDBPyConnection) dict ¶
-
Fetch a result as a dict of TensorFlow Tensors following execute()
- torch(self: duckdb.duckdb.DuckDBPyConnection) dict ¶
-
Fetch a result as a dict of PyTorch Tensors following execute()
- type(self: duckdb.duckdb.DuckDBPyConnection, type_str: str) duckdb.duckdb.typing.DuckDBPyType ¶
-
Create a type object by parsing the ‘type_str’ string
- union_type(self: duckdb.duckdb.DuckDBPyConnection, members: object) duckdb.duckdb.typing.DuckDBPyType ¶
-
Create a union type object from ‘members’
- unregister(self: duckdb.duckdb.DuckDBPyConnection, view_name: str) duckdb.duckdb.DuckDBPyConnection ¶
-
Unregister the view with the given name
- unregister_filesystem(self: duckdb.duckdb.DuckDBPyConnection, name: str) None ¶
-
Unregister a filesystem
- values(self: duckdb.duckdb.DuckDBPyConnection, values: object) duckdb.duckdb.DuckDBPyRelation ¶
-
Create a relation object from the passed values
- view(self: duckdb.duckdb.DuckDBPyConnection, view_name: str) duckdb.duckdb.DuckDBPyRelation ¶
-
Create a relation object for the named view
- class duckdb.DuckDBPyRelation¶
-
Bases:
pybind11_object
- aggregate(self: duckdb.duckdb.DuckDBPyRelation, aggr_expr: object, group_expr: str = '') duckdb.duckdb.DuckDBPyRelation ¶
-
Compute the aggregate aggr_expr by the optional groups group_expr on the relation
- property alias¶
-
Get the name of the current alias
- any_value(self: duckdb.duckdb.DuckDBPyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation ¶
-
Returns the first non-null value from a given column
- apply(self: duckdb.duckdb.DuckDBPyRelation, function_name: str, function_aggr: str, group_expr: str = '', function_parameter: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation ¶
-
Compute the function of a single column or a list of columns by the optional groups on the relation
- arg_max(self: duckdb.duckdb.DuckDBPyRelation, arg_column: str, value_column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation ¶
-
Finds the row with the maximum value for a value column and returns the value of that row for an argument column
- arg_min(self: duckdb.duckdb.DuckDBPyRelation, arg_column: str, value_column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation ¶
-
Finds the row with the minimum value for a value column and returns the value of that row for an argument column
- arrow(self: duckdb.duckdb.DuckDBPyRelation, batch_size: int = 1000000) pyarrow.lib.Table ¶
-
Execute and fetch all rows as an Arrow Table
- avg(self: duckdb.duckdb.DuckDBPyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation ¶
-
Computes the average on a given column
- bit_and(self: duckdb.duckdb.DuckDBPyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation ¶
-
Computes the bitwise AND of all bits present in a given column
- bit_or(self: duckdb.duckdb.DuckDBPyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation ¶
-
Computes the bitwise OR of all bits present in a given column
- bit_xor(self: duckdb.duckdb.DuckDBPyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation ¶
-
Computes the bitwise XOR of all bits present in a given column
- bitstring_agg(self: duckdb.duckdb.DuckDBPyRelation, column: str, min: Optional[object] = None, max: Optional[object] = None, groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation ¶
-
Computes a bitstring with bits set for each distinct value in a given column
- bool_and(self: duckdb.duckdb.DuckDBPyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation ¶
-
Computes the logical AND of all values present in a given column
- bool_or(self: duckdb.duckdb.DuckDBPyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation ¶
-
Computes the logical OR of all values present in a given column
- close(self: duckdb.duckdb.DuckDBPyRelation) None ¶
-
Closes the result
- property columns¶
-
Return a list containing the names of the columns of the relation.
- count(self: duckdb.duckdb.DuckDBPyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation ¶
-
Computes the number of elements present in a given column
- create(self: duckdb.duckdb.DuckDBPyRelation, table_name: str) None ¶
-
Creates a new table named table_name with the contents of the relation object
- create_view(self: duckdb.duckdb.DuckDBPyRelation, view_name: str, replace: bool = True) duckdb.duckdb.DuckDBPyRelation ¶
-
Creates a view named view_name that refers to the relation object
- cume_dist(self: duckdb.duckdb.DuckDBPyRelation, window_spec: str, projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation ¶
-
Computes the cumulative distribution within the partition
- dense_rank(self: duckdb.duckdb.DuckDBPyRelation, window_spec: str, projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation ¶
-
Computes the dense rank within the partition
- describe(self: duckdb.duckdb.DuckDBPyRelation) duckdb.duckdb.DuckDBPyRelation ¶
-
Gives basic statistics (e.g., min, max) and whether `NULL` values exist for each column of the relation.
- property description¶
-
Return the description of the result
- df(self: duckdb.duckdb.DuckDBPyRelation, *, date_as_object: bool = False) pandas.DataFrame ¶
-
Execute and fetch all rows as a pandas DataFrame
- distinct(self: duckdb.duckdb.DuckDBPyRelation) duckdb.duckdb.DuckDBPyRelation ¶
-
Retrieve distinct rows from this relation object
- property dtypes¶
-
Return a list containing the types of the columns of the relation.
- except_(self: duckdb.duckdb.DuckDBPyRelation, other_rel: duckdb.duckdb.DuckDBPyRelation) duckdb.duckdb.DuckDBPyRelation ¶
-
Create the set difference (EXCEPT) of this relation object and another relation object in other_rel
- execute(self: duckdb.duckdb.DuckDBPyRelation) duckdb.duckdb.DuckDBPyRelation ¶
-
Transform the relation into a result set
- explain(self: duckdb.duckdb.DuckDBPyRelation, type: duckdb.duckdb.ExplainType = 'standard') str ¶
- favg(self: duckdb.duckdb.DuckDBPyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation ¶
-
Computes the average of all values present in a given column using a more accurate floating point summation (Kahan Sum)
- fetch_arrow_reader(self: duckdb.duckdb.DuckDBPyRelation, batch_size: int = 1000000) pyarrow.lib.RecordBatchReader ¶
-
Execute and return an Arrow Record Batch Reader that yields all rows
- fetch_arrow_table(self: duckdb.duckdb.DuckDBPyRelation, batch_size: int = 1000000) pyarrow.lib.Table ¶
-
Execute and fetch all rows as an Arrow Table
- fetch_df_chunk(self: duckdb.duckdb.DuckDBPyRelation, vectors_per_chunk: int = 1, *, date_as_object: bool = False) pandas.DataFrame ¶
-
Execute and fetch a chunk of the rows
- fetchall(self: duckdb.duckdb.DuckDBPyRelation) list ¶
-
Execute and fetch all rows as a list of tuples
- fetchdf(self: duckdb.duckdb.DuckDBPyRelation, *, date_as_object: bool = False) pandas.DataFrame ¶
-
Execute and fetch all rows as a pandas DataFrame
- fetchmany(self: duckdb.duckdb.DuckDBPyRelation, size: int = 1) list ¶
-
Execute and fetch the next set of rows as a list of tuples
- fetchnumpy(self: duckdb.duckdb.DuckDBPyRelation) dict ¶
-
Execute and fetch all rows as a Python dict mapping each column to one NumPy array
- fetchone(self: duckdb.duckdb.DuckDBPyRelation) Optional[tuple] ¶
-
Execute and fetch a single row as a tuple
- filter(self: duckdb.duckdb.DuckDBPyRelation, filter_expr: object) duckdb.duckdb.DuckDBPyRelation ¶
-
Filter the relation object by the filter in filter_expr
- first(self: duckdb.duckdb.DuckDBPyRelation, column: str, groups: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation ¶
-
Returns the first value of a given column
- first_value(self: duckdb.duckdb.DuckDBPyRelation, column: str, window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation ¶
-
Computes the first value within the group or partition
- fsum(self: duckdb.duckdb.DuckDBPyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation ¶
-
Computes the sum of all values present in a given column using a more accurate floating point summation (Kahan Sum)
- geomean(self: duckdb.duckdb.DuckDBPyRelation, column: str, groups: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation ¶
-
Computes the geometric mean over all values present in a given column
- histogram(self: duckdb.duckdb.DuckDBPyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation ¶
-
Computes the histogram over all values present in a given column
- insert(self: duckdb.duckdb.DuckDBPyRelation, values: object) None ¶
-
Inserts the given values into the relation
- insert_into(self: duckdb.duckdb.DuckDBPyRelation, table_name: str) None ¶
-
Inserts the relation object into an existing table named table_name
- intersect(self: duckdb.duckdb.DuckDBPyRelation, other_rel: duckdb.duckdb.DuckDBPyRelation) duckdb.duckdb.DuckDBPyRelation ¶
-
Create the set intersection of this relation object with another relation object in other_rel
- join(self: duckdb.duckdb.DuckDBPyRelation, other_rel: duckdb.duckdb.DuckDBPyRelation, condition: object, how: str = 'inner') duckdb.duckdb.DuckDBPyRelation ¶
-
Join the relation object with another relation object in other_rel using the join condition expression in condition. Types supported are ‘inner’ and ‘left’
- lag(self: duckdb.duckdb.DuckDBPyRelation, column: str, window_spec: str, offset: int = 1, default_value: str = 'NULL', ignore_nulls: bool = False, projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation ¶
-
Computes the lag within the partition
- last(self: duckdb.duckdb.DuckDBPyRelation, column: str, groups: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation ¶
-
Returns the last value of a given column
- last_value(self: duckdb.duckdb.DuckDBPyRelation, column: str, window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation ¶
-
Computes the last value within the group or partition
- lead(self: duckdb.duckdb.DuckDBPyRelation, column: str, window_spec: str, offset: int = 1, default_value: str = 'NULL', ignore_nulls: bool = False, projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation ¶
-
Computes the lead within the partition
- limit(self: duckdb.duckdb.DuckDBPyRelation, n: int, offset: int = 0) duckdb.duckdb.DuckDBPyRelation ¶
-
Only retrieve the first n rows from this relation object, starting at offset
- list(self: duckdb.duckdb.DuckDBPyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation ¶
-
Returns a list containing all values present in a given column
- map(self: duckdb.duckdb.DuckDBPyRelation, map_function: Callable, *, schema: Optional[object] = None) duckdb.duckdb.DuckDBPyRelation ¶
-
Calls the passed function on the relation
- max(self: duckdb.duckdb.DuckDBPyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation ¶
-
Returns the maximum value present in a given column
- mean(self: duckdb.duckdb.DuckDBPyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation ¶
-
Computes the average on a given column
- median(self: duckdb.duckdb.DuckDBPyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation ¶
-
Computes the median over all values present in a given column
- min(self: duckdb.duckdb.DuckDBPyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation ¶
-
Returns the minimum value present in a given column
- mode(self: duckdb.duckdb.DuckDBPyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation ¶
-
Computes the mode over all values present in a given column
- n_tile(self: duckdb.duckdb.DuckDBPyRelation, window_spec: str, num_buckets: int, projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation ¶
-
Divides the partition as equally as possible into num_buckets
- nth_value(self: duckdb.duckdb.DuckDBPyRelation, column: str, window_spec: str, offset: int, ignore_nulls: bool = False, projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation ¶
-
Computes the nth value within the partition
- order(self: duckdb.duckdb.DuckDBPyRelation, order_expr: str) duckdb.duckdb.DuckDBPyRelation ¶
-
Reorder the relation object by order_expr
- percent_rank(self: duckdb.duckdb.DuckDBPyRelation, window_spec: str, projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation ¶
-
Computes the relative rank within the partition
- pl(self: duckdb.duckdb.DuckDBPyRelation, batch_size: int = 1000000) polars.DataFrame ¶
-
Execute and fetch all rows as a Polars DataFrame
- product(self: duckdb.duckdb.DuckDBPyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation ¶
-
Returns the product of all values present in a given column
- project(self: duckdb.duckdb.DuckDBPyRelation, *args, groups: str = '') duckdb.duckdb.DuckDBPyRelation ¶
-
Project the relation object by the projection expressions passed in args
- quantile(self: duckdb.duckdb.DuckDBPyRelation, column: str, q: object = 0.5, groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation ¶
-
Computes the exact quantile value for a given column
- quantile_cont(self: duckdb.duckdb.DuckDBPyRelation, column: str, q: object = 0.5, groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation ¶
-
Computes the interpolated quantile value for a given column
- quantile_disc(self: duckdb.duckdb.DuckDBPyRelation, column: str, q: object = 0.5, groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation ¶
-
Computes the exact quantile value for a given column
- query(self: duckdb.duckdb.DuckDBPyRelation, virtual_table_name: str, sql_query: str) duckdb.duckdb.DuckDBPyRelation ¶
-
Run the given SQL query in sql_query on the view named virtual_table_name that refers to the relation object
- rank(self: duckdb.duckdb.DuckDBPyRelation, window_spec: str, projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation ¶
-
Computes the rank within the partition
- rank_dense(self: duckdb.duckdb.DuckDBPyRelation, window_spec: str, projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation ¶
-
Computes the dense rank within the partition
- record_batch(self: duckdb.duckdb.DuckDBPyRelation, batch_size: int = 1000000) pyarrow.lib.RecordBatchReader ¶
-
Execute and return an Arrow Record Batch Reader that yields all rows
- row_number(self: duckdb.duckdb.DuckDBPyRelation, window_spec: str, projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation ¶
-
Computes the row number within the partition
- select(self: duckdb.duckdb.DuckDBPyRelation, *args, groups: str = '') duckdb.duckdb.DuckDBPyRelation ¶
-
Project the relation object by the projection expressions passed in args
- select_dtypes(self: duckdb.duckdb.DuckDBPyRelation, types: object) duckdb.duckdb.DuckDBPyRelation ¶
-
Select columns from the relation, by filtering based on type(s)
- select_types(self: duckdb.duckdb.DuckDBPyRelation, types: object) duckdb.duckdb.DuckDBPyRelation ¶
-
Select columns from the relation, by filtering based on type(s)
- set_alias(self: duckdb.duckdb.DuckDBPyRelation, alias: str) duckdb.duckdb.DuckDBPyRelation ¶
-
Rename the relation object to new alias
- property shape¶
-
Tuple of the number of rows and the number of columns in the relation.
- show(self: duckdb.duckdb.DuckDBPyRelation, *, max_width: Optional[int] = None, max_rows: Optional[int] = None, max_col_width: Optional[int] = None, null_value: Optional[str] = None, render_mode: object = None) None ¶
-
Display a summary of the data
- sort(self: duckdb.duckdb.DuckDBPyRelation, *args) duckdb.duckdb.DuckDBPyRelation ¶
-
Reorder the relation object by the provided expressions
- sql_query(self: duckdb.duckdb.DuckDBPyRelation) str ¶
-
Get the SQL query that is equivalent to the relation
- std(self: duckdb.duckdb.DuckDBPyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation ¶
-
Computes the sample standard deviation for a given column
- stddev(self: duckdb.duckdb.DuckDBPyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation ¶
-
Computes the sample standard deviation for a given column
- stddev_pop(self: duckdb.duckdb.DuckDBPyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation ¶
-
Computes the population standard deviation for a given column
- stddev_samp(self: duckdb.duckdb.DuckDBPyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation ¶
-
Computes the sample standard deviation for a given column
- string_agg(self: duckdb.duckdb.DuckDBPyRelation, column: str, sep: str = ',', groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation ¶
-
Concatenates the values present in a given column with a separator
- sum(self: duckdb.duckdb.DuckDBPyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation ¶
-
Computes the sum of all values present in a given column
- tf(self: duckdb.duckdb.DuckDBPyRelation) dict ¶
-
Fetch a result as a dict of TensorFlow Tensors
- to_arrow_table(self: duckdb.duckdb.DuckDBPyRelation, batch_size: int = 1000000) pyarrow.lib.Table ¶
-
Execute and fetch all rows as an Arrow Table
- to_csv(self: duckdb.duckdb.DuckDBPyRelation, file_name: str, *, sep: object = None, na_rep: object = None, header: object = None, quotechar: object = None, escapechar: object = None, date_format: object = None, timestamp_format: object = None, quoting: object = None, encoding: object = None, compression: object = None, overwrite: object = None, per_thread_output: object = None, use_tmp_file: object = None, partition_by: object = None, write_partition_columns: object = None) None ¶
-
Write the relation object to a CSV file in ‘file_name’
- to_df(self: duckdb.duckdb.DuckDBPyRelation, *, date_as_object: bool = False) pandas.DataFrame ¶
-
Execute and fetch all rows as a pandas DataFrame
- to_parquet(self: duckdb.duckdb.DuckDBPyRelation, file_name: str, *, compression: object = None, field_ids: object = None, row_group_size_bytes: object = None, row_group_size: object = None) None ¶
-
Write the relation object to a Parquet file in ‘file_name’
- to_table(self: duckdb.duckdb.DuckDBPyRelation, table_name: str) None ¶
-
Creates a new table named table_name with the contents of the relation object
- to_view(self: duckdb.duckdb.DuckDBPyRelation, view_name: str, replace: bool = True) duckdb.duckdb.DuckDBPyRelation ¶
-
Creates a view named view_name that refers to the relation object
- torch(self: duckdb.duckdb.DuckDBPyRelation) dict ¶
-
Fetch a result as dict of PyTorch Tensors
- property type¶
-
Get the type of the relation.
- property types¶
-
Return a list containing the types of the columns of the relation.
- union(self: duckdb.duckdb.DuckDBPyRelation, union_rel: duckdb.duckdb.DuckDBPyRelation) duckdb.duckdb.DuckDBPyRelation ¶
-
Create the set union of this relation object with another relation object in union_rel
- unique(self: duckdb.duckdb.DuckDBPyRelation, unique_aggr: str) duckdb.duckdb.DuckDBPyRelation ¶
-
Number of distinct values in a column.
- value_counts(self: duckdb.duckdb.DuckDBPyRelation, column: str, groups: str = '') duckdb.duckdb.DuckDBPyRelation ¶
-
Computes the number of elements present in a given column, also projecting the original column
- var(self: duckdb.duckdb.DuckDBPyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation ¶
-
Computes the sample variance for a given column
- var_pop(self: duckdb.duckdb.DuckDBPyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation ¶
-
Computes the population variance for a given column
- var_samp(self: duckdb.duckdb.DuckDBPyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation ¶
-
Computes the sample variance for a given column
- variance(self: duckdb.duckdb.DuckDBPyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation ¶
-
Computes the sample variance for a given column
- write_csv(self: duckdb.duckdb.DuckDBPyRelation, file_name: str, *, sep: object = None, na_rep: object = None, header: object = None, quotechar: object = None, escapechar: object = None, date_format: object = None, timestamp_format: object = None, quoting: object = None, encoding: object = None, compression: object = None, overwrite: object = None, per_thread_output: object = None, use_tmp_file: object = None, partition_by: object = None, write_partition_columns: object = None) None ¶
-
Write the relation object to a CSV file in ‘file_name’
- write_parquet(self: duckdb.duckdb.DuckDBPyRelation, file_name: str, *, compression: object = None, field_ids: object = None, row_group_size_bytes: object = None, row_group_size: object = None) None ¶
-
Write the relation object to a Parquet file in ‘file_name’
- exception duckdb.Error¶
-
Bases:
Exception
- class duckdb.ExplainType¶
-
Bases:
pybind11_object
Members:
STANDARD
ANALYZE
- ANALYZE = <ExplainType.ANALYZE: 1>¶
- STANDARD = <ExplainType.STANDARD: 0>¶
- property name¶
- property value¶
- class duckdb.Expression¶
-
Bases:
pybind11_object
- alias(self: duckdb.duckdb.Expression, arg0: str) duckdb.duckdb.Expression ¶
-
Create a copy of this expression with the given alias.
- Parameters:
-
name: The alias to use for the expression, this will affect how it can be referenced.
- Returns:
-
Expression: self with an alias.
- asc(self: duckdb.duckdb.Expression) duckdb.duckdb.Expression ¶
-
Set the order by modifier to ASCENDING.
- cast(self: duckdb.duckdb.Expression, type: duckdb.duckdb.typing.DuckDBPyType) duckdb.duckdb.Expression ¶
-
Create a CastExpression to type from self
- Parameters:
-
type: The type to cast to
- Returns:
-
CastExpression: self::type
- desc(self: duckdb.duckdb.Expression) duckdb.duckdb.Expression ¶
-
Set the order by modifier to DESCENDING.
- isin(self: duckdb.duckdb.Expression, *args) duckdb.duckdb.Expression ¶
-
Return an IN expression comparing self to the input arguments.
- Returns:
-
DuckDBPyExpression: The compare IN expression
- isnotin(self: duckdb.duckdb.Expression, *args) duckdb.duckdb.Expression ¶
-
Return a NOT IN expression comparing self to the input arguments.
- Returns:
-
DuckDBPyExpression: The compare NOT IN expression
- isnotnull(self: duckdb.duckdb.Expression) duckdb.duckdb.Expression ¶
-
Create a binary IS NOT NULL expression from self
- Returns:
-
DuckDBPyExpression: self IS NOT NULL
- isnull(self: duckdb.duckdb.Expression) duckdb.duckdb.Expression ¶
-
Create a binary IS NULL expression from self
- Returns:
-
DuckDBPyExpression: self IS NULL
- nulls_first(self: duckdb.duckdb.Expression) duckdb.duckdb.Expression ¶
-
Set the NULL order by modifier to NULLS FIRST.
- nulls_last(self: duckdb.duckdb.Expression) duckdb.duckdb.Expression ¶
-
Set the NULL order by modifier to NULLS LAST.
- otherwise(self: duckdb.duckdb.Expression, value: duckdb.duckdb.Expression) duckdb.duckdb.Expression ¶
-
Add an ELSE <value> clause to the CaseExpression.
- Parameters:
-
value: The value to use if none of the WHEN conditions are met.
- Returns:
-
CaseExpression: self with an ELSE clause.
- show(self: duckdb.duckdb.Expression) None ¶
-
Print the stringified version of the expression.
- when(self: duckdb.duckdb.Expression, condition: duckdb.duckdb.Expression, value: duckdb.duckdb.Expression) duckdb.duckdb.Expression ¶
-
Add an additional WHEN <condition> THEN <value> clause to the CaseExpression.
- Parameters:
-
condition: The condition that must be met.
value: The value to use if the condition is met.
- Returns:
-
CaseExpression: self with an additional WHEN clause.
- exception duckdb.FatalException¶
-
Bases:
DatabaseError
- duckdb.FunctionExpression(function_name: str, *args) duckdb.duckdb.Expression ¶
- exception duckdb.HTTPException¶
-
Bases:
IOException
Thrown when an error occurs in the httpfs extension, or whilst downloading an extension.
- body: str¶
- headers: Dict[str, str]¶
- reason: str¶
- status_code: int¶
- exception duckdb.IOException¶
-
Bases:
OperationalError
- exception duckdb.IntegrityError¶
-
Bases:
DatabaseError
- exception duckdb.InternalError¶
-
Bases:
DatabaseError
- exception duckdb.InternalException¶
-
Bases:
InternalError
- exception duckdb.InterruptException¶
-
Bases:
DatabaseError
- exception duckdb.InvalidInputException¶
-
Bases:
ProgrammingError
- exception duckdb.InvalidTypeException¶
-
Bases:
ProgrammingError
- exception duckdb.NotImplementedException¶
-
Bases:
NotSupportedError
- exception duckdb.NotSupportedError¶
-
Bases:
DatabaseError
- exception duckdb.OperationalError¶
-
Bases:
DatabaseError
- exception duckdb.OutOfMemoryException¶
-
Bases:
OperationalError
- exception duckdb.ParserException¶
-
Bases:
ProgrammingError
- exception duckdb.PermissionException¶
-
Bases:
DatabaseError
- exception duckdb.ProgrammingError¶
-
Bases:
DatabaseError
- class duckdb.PythonExceptionHandling¶
-
Bases:
pybind11_object
Members:
DEFAULT
RETURN_NULL
- DEFAULT = <PythonExceptionHandling.DEFAULT: 0>¶
- RETURN_NULL = <PythonExceptionHandling.RETURN_NULL: 1>¶
- property name¶
- property value¶
- exception duckdb.SequenceException¶
-
Bases:
DatabaseError
- exception duckdb.SerializationException¶
-
Bases:
OperationalError
- duckdb.StarExpression(*args, **kwargs)¶
-
Overloaded function.
StarExpression(*, exclude: object = None) -> duckdb.duckdb.Expression
StarExpression() -> duckdb.duckdb.Expression
- exception duckdb.SyntaxException¶
-
Bases:
ProgrammingError
- exception duckdb.TransactionException¶
-
Bases:
OperationalError
- class duckdb.Value(object: Any, type: DuckDBPyType)¶
-
Bases:
object
- exception duckdb.Warning¶
-
Bases:
Exception
- duckdb.aggregate(df: pandas.DataFrame, aggr_expr: object, group_expr: str = '', *, connection: duckdb.DuckDBPyConnection = None) duckdb.duckdb.DuckDBPyRelation ¶
-
Compute the aggregate aggr_expr by the optional groups group_expr on the relation
- duckdb.alias(df: pandas.DataFrame, alias: str, *, connection: duckdb.DuckDBPyConnection = None) duckdb.duckdb.DuckDBPyRelation ¶
-
Rename the relation object to new alias
- duckdb.append(table_name: str, df: pandas.DataFrame, *, by_name: bool = False, connection: duckdb.DuckDBPyConnection = None) duckdb.DuckDBPyConnection ¶
-
Append the passed DataFrame to the named table
- duckdb.array_type(type: duckdb.duckdb.typing.DuckDBPyType, size: int, *, connection: duckdb.DuckDBPyConnection = None) duckdb.duckdb.typing.DuckDBPyType ¶
-
Create an array type object of ‘type’
- duckdb.arrow(*args, **kwargs)¶
-
Overloaded function.
arrow(rows_per_batch: int = 1000000, *, connection: duckdb.DuckDBPyConnection = None) -> pyarrow.lib.Table
Fetch a result as Arrow table following execute()
arrow(arrow_object: object, *, connection: duckdb.DuckDBPyConnection = None) -> duckdb.duckdb.DuckDBPyRelation
Create a relation object from an Arrow object
- duckdb.begin(*, connection: duckdb.DuckDBPyConnection = None) duckdb.DuckDBPyConnection ¶
-
Start a new transaction
- duckdb.checkpoint(*, connection: duckdb.DuckDBPyConnection = None) duckdb.DuckDBPyConnection ¶
-
Synchronizes data in the write-ahead log (WAL) to the database data file (no-op for in-memory connections)
- duckdb.close(*, connection: duckdb.DuckDBPyConnection = None) None ¶
-
Close the connection
- duckdb.commit(*, connection: duckdb.DuckDBPyConnection = None) duckdb.DuckDBPyConnection ¶
-
Commit changes performed within a transaction
- duckdb.connect(database: object = ':memory:', read_only: bool = False, config: dict = None) duckdb.DuckDBPyConnection ¶
-
Create a DuckDB database instance. Can take a database file name to read/write persistent data and a read_only flag if no changes are desired
- duckdb.create_function(name: str, function: Callable, parameters: object = None, return_type: duckdb.duckdb.typing.DuckDBPyType = None, *, type: duckdb.duckdb.functional.PythonUDFType = <PythonUDFType.NATIVE: 0>, null_handling: duckdb.duckdb.functional.FunctionNullHandling = <FunctionNullHandling.DEFAULT: 0>, exception_handling: duckdb.duckdb.PythonExceptionHandling = <PythonExceptionHandling.DEFAULT: 0>, side_effects: bool = False, connection: duckdb.DuckDBPyConnection = None) duckdb.DuckDBPyConnection ¶
-
Create a DuckDB function out of the passed-in Python function so it can be used in queries
- duckdb.cursor(*, connection: duckdb.DuckDBPyConnection = None) duckdb.DuckDBPyConnection ¶
-
Create a duplicate of the current connection
- duckdb.decimal_type(width: int, scale: int, *, connection: duckdb.DuckDBPyConnection = None) duckdb.duckdb.typing.DuckDBPyType ¶
-
Create a decimal type with ‘width’ and ‘scale’
- duckdb.description(*, connection: duckdb.DuckDBPyConnection = None) Optional[list] ¶
-
Get result set attributes, mainly column names
- duckdb.df(*args, **kwargs)¶
-
Overloaded function.
df(*, date_as_object: bool = False, connection: duckdb.DuckDBPyConnection = None) -> pandas.DataFrame
Fetch a result as DataFrame following execute()
df(df: pandas.DataFrame, *, connection: duckdb.DuckDBPyConnection = None) -> duckdb.duckdb.DuckDBPyRelation
Create a relation object from the DataFrame df
- duckdb.distinct(df: pandas.DataFrame, *, connection: duckdb.DuckDBPyConnection = None) duckdb.duckdb.DuckDBPyRelation ¶
-
Retrieve distinct rows from this relation object
- duckdb.dtype(type_str: str, *, connection: duckdb.DuckDBPyConnection = None) duckdb.duckdb.typing.DuckDBPyType ¶
-
Create a type object by parsing the ‘type_str’ string
- duckdb.duplicate(*, connection: duckdb.DuckDBPyConnection = None) duckdb.DuckDBPyConnection ¶
-
Create a duplicate of the current connection
- duckdb.enum_type(name: str, type: duckdb.duckdb.typing.DuckDBPyType, values: list, *, connection: duckdb.DuckDBPyConnection = None) duckdb.duckdb.typing.DuckDBPyType ¶
-
Create an enum type of underlying ‘type’, consisting of the list of ‘values’
- duckdb.execute(query: object, parameters: object = None, *, connection: duckdb.DuckDBPyConnection = None) duckdb.DuckDBPyConnection ¶
-
Execute the given SQL query, optionally using prepared statements with parameters set
- duckdb.executemany(query: object, parameters: object = None, *, connection: duckdb.DuckDBPyConnection = None) duckdb.DuckDBPyConnection ¶
-
Execute the given prepared statement multiple times using the list of parameter sets in parameters
- duckdb.extract_statements(query: str, *, connection: duckdb.DuckDBPyConnection = None) list ¶
-
Parse the query string and extract the Statement object(s) produced
- duckdb.fetch_arrow_table(rows_per_batch: int = 1000000, *, connection: duckdb.DuckDBPyConnection = None) pyarrow.lib.Table ¶
-
Fetch a result as Arrow table following execute()
- duckdb.fetch_df(*, date_as_object: bool = False, connection: duckdb.DuckDBPyConnection = None) pandas.DataFrame ¶
-
Fetch a result as DataFrame following execute()
- duckdb.fetch_df_chunk(vectors_per_chunk: int = 1, *, date_as_object: bool = False, connection: duckdb.DuckDBPyConnection = None) pandas.DataFrame ¶
-
Fetch a chunk of the result as DataFrame following execute()
- duckdb.fetch_record_batch(rows_per_batch: int = 1000000, *, connection: duckdb.DuckDBPyConnection = None) pyarrow.lib.RecordBatchReader ¶
-
Fetch an Arrow RecordBatchReader following execute()
- duckdb.fetchall(*, connection: duckdb.DuckDBPyConnection = None) list ¶
-
Fetch all rows from a result following execute
- duckdb.fetchdf(*, date_as_object: bool = False, connection: duckdb.DuckDBPyConnection = None) pandas.DataFrame ¶
-
Fetch a result as DataFrame following execute()
- duckdb.fetchmany(size: int = 1, *, connection: duckdb.DuckDBPyConnection = None) list ¶
-
Fetch the next set of rows from a result following execute
- duckdb.fetchnumpy(*, connection: duckdb.DuckDBPyConnection = None) dict ¶
-
Fetch a result as dict of NumPy arrays following execute
- duckdb.fetchone(*, connection: duckdb.DuckDBPyConnection = None) Optional[tuple] ¶
-
Fetch a single row from a result following execute
- duckdb.filesystem_is_registered(name: str, *, connection: duckdb.DuckDBPyConnection = None) bool ¶
-
Check if a filesystem with the provided name is currently registered
- duckdb.filter(df: pandas.DataFrame, filter_expr: object, *, connection: duckdb.DuckDBPyConnection = None) duckdb.duckdb.DuckDBPyRelation ¶
-
Filter the relation object by the filter in filter_expr
- duckdb.from_arrow(arrow_object: object, *, connection: duckdb.DuckDBPyConnection = None) duckdb.duckdb.DuckDBPyRelation ¶
-
Create a relation object from an Arrow object
- duckdb.from_csv_auto(path_or_buffer: object, **kwargs) duckdb.duckdb.DuckDBPyRelation ¶
-
Create a relation object from the CSV file in ‘path_or_buffer’
- duckdb.from_df(df: pandas.DataFrame, *, connection: duckdb.DuckDBPyConnection = None) duckdb.duckdb.DuckDBPyRelation ¶
-
Create a relation object from the DataFrame in df
- duckdb.from_parquet(*args, **kwargs)¶
-
Overloaded function.
from_parquet(file_glob: str, binary_as_string: bool = False, *, file_row_number: bool = False, filename: bool = False, hive_partitioning: bool = False, union_by_name: bool = False, compression: object = None, connection: duckdb.DuckDBPyConnection = None) -> duckdb.duckdb.DuckDBPyRelation
Create a relation object from the Parquet files in file_glob
from_parquet(file_globs: list[str], binary_as_string: bool = False, *, file_row_number: bool = False, filename: bool = False, hive_partitioning: bool = False, union_by_name: bool = False, compression: object = None, connection: duckdb.DuckDBPyConnection = None) -> duckdb.duckdb.DuckDBPyRelation
Create a relation object from the Parquet files in file_globs
- duckdb.from_query(query: object, *, alias: str = '', params: object = None, connection: duckdb.DuckDBPyConnection = None) duckdb.duckdb.DuckDBPyRelation ¶
-
Run a SQL query. If it is a SELECT statement, create a relation object from the given SQL query, otherwise run the query as-is.
- duckdb.from_substrait(proto: bytes, *, connection: duckdb.DuckDBPyConnection = None) duckdb.duckdb.DuckDBPyRelation ¶
-
Create a query object from a protobuf plan
- duckdb.from_substrait_json(json: str, *, connection: duckdb.DuckDBPyConnection = None) duckdb.duckdb.DuckDBPyRelation ¶
-
Create a query object from a JSON protobuf plan
- duckdb.get_substrait(query: str, *, enable_optimizer: bool = True, connection: duckdb.DuckDBPyConnection = None) duckdb.duckdb.DuckDBPyRelation ¶
-
Serialize a query to protobuf
- duckdb.get_substrait_json(query: str, *, enable_optimizer: bool = True, connection: duckdb.DuckDBPyConnection = None) duckdb.duckdb.DuckDBPyRelation ¶
-
Serialize a query to protobuf in JSON format
- duckdb.get_table_names(query: str, *, connection: duckdb.DuckDBPyConnection = None) set[str] ¶
-
Extract the required table names from a query
- duckdb.install_extension(extension: str, *, force_install: bool = False, repository: object = None, repository_url: object = None, version: object = None, connection: duckdb.DuckDBPyConnection = None) None ¶
-
Install an extension by name, with an optional version and/or repository to get the extension from
- duckdb.interrupt(*, connection: duckdb.DuckDBPyConnection = None) None ¶
-
Interrupt pending operations
- duckdb.limit(df: pandas.DataFrame, n: int, offset: int = 0, *, connection: duckdb.DuckDBPyConnection = None) duckdb.duckdb.DuckDBPyRelation ¶
-
Only retrieve the first n rows from this relation object, starting at offset
- duckdb.list_filesystems(*, connection: duckdb.DuckDBPyConnection = None) list ¶
-
List registered filesystems, including builtin ones
- duckdb.list_type(type: duckdb.duckdb.typing.DuckDBPyType, *, connection: duckdb.DuckDBPyConnection = None) duckdb.duckdb.typing.DuckDBPyType ¶
-
Create a list type object of ‘type’
- duckdb.load_extension(extension: str, *, connection: duckdb.DuckDBPyConnection = None) None ¶
-
Load an installed extension
- duckdb.map_type(key: duckdb.duckdb.typing.DuckDBPyType, value: duckdb.duckdb.typing.DuckDBPyType, *, connection: duckdb.DuckDBPyConnection = None) duckdb.duckdb.typing.DuckDBPyType ¶
-
Create a map type object from ‘key’ and ‘value’
- duckdb.order(df: pandas.DataFrame, order_expr: str, *, connection: duckdb.DuckDBPyConnection = None) duckdb.duckdb.DuckDBPyRelation ¶
-
Reorder the relation object by order_expr
- duckdb.pl(rows_per_batch: int = 1000000, *, connection: duckdb.DuckDBPyConnection = None) duckdb::PolarsDataFrame ¶
-
Fetch a result as Polars DataFrame following execute()
- duckdb.project(df: pandas.DataFrame, *args, groups: str = '', connection: duckdb.DuckDBPyConnection = None) duckdb.duckdb.DuckDBPyRelation ¶
-
Project the relation object by the projection expressions passed in args
- duckdb.query(query: object, *, alias: str = '', params: object = None, connection: duckdb.DuckDBPyConnection = None) duckdb.duckdb.DuckDBPyRelation ¶
-
Run a SQL query. If it is a SELECT statement, create a relation object from the given SQL query, otherwise run the query as-is.
- duckdb.query_df(df: pandas.DataFrame, virtual_table_name: str, sql_query: str, *, connection: duckdb.DuckDBPyConnection = None) duckdb.duckdb.DuckDBPyRelation ¶
-
Run the given SQL query in sql_query on the view named virtual_table_name that refers to the relation object
- duckdb.read_csv(path_or_buffer: object, **kwargs) duckdb.duckdb.DuckDBPyRelation ¶
-
Create a relation object from the CSV file in ‘path_or_buffer’
- duckdb.read_json(path_or_buffer: object, *, columns: Optional[object] = None, sample_size: Optional[object] = None, maximum_depth: Optional[object] = None, records: Optional[str] = None, format: Optional[str] = None, date_format: Optional[object] = None, timestamp_format: Optional[object] = None, compression: Optional[object] = None, maximum_object_size: Optional[object] = None, ignore_errors: Optional[object] = None, convert_strings_to_integers: Optional[object] = None, field_appearance_threshold: Optional[object] = None, map_inference_threshold: Optional[object] = None, maximum_sample_files: Optional[object] = None, filename: Optional[object] = None, hive_partitioning: Optional[object] = None, union_by_name: Optional[object] = None, hive_types: Optional[object] = None, hive_types_autocast: Optional[object] = None, connection: duckdb.DuckDBPyConnection = None) duckdb.duckdb.DuckDBPyRelation ¶
-
Create a relation object from the JSON file in ‘path_or_buffer’
- duckdb.read_parquet(*args, **kwargs)¶
-
Overloaded function.
read_parquet(file_glob: str, binary_as_string: bool = False, *, file_row_number: bool = False, filename: bool = False, hive_partitioning: bool = False, union_by_name: bool = False, compression: object = None, connection: duckdb.DuckDBPyConnection = None) -> duckdb.duckdb.DuckDBPyRelation
Create a relation object from the Parquet files in file_glob
read_parquet(file_globs: list[str], binary_as_string: bool = False, *, file_row_number: bool = False, filename: bool = False, hive_partitioning: bool = False, union_by_name: bool = False, compression: object = None, connection: duckdb.DuckDBPyConnection = None) -> duckdb.duckdb.DuckDBPyRelation
Create a relation object from the Parquet files in file_globs
- duckdb.register(view_name: str, python_object: object, *, connection: duckdb.DuckDBPyConnection = None) duckdb.DuckDBPyConnection ¶
-
Register the passed Python Object value for querying with a view
- duckdb.register_filesystem(filesystem: fsspec.AbstractFileSystem, *, connection: duckdb.DuckDBPyConnection = None) None ¶
-
Register a fsspec compliant filesystem
- duckdb.remove_function(name: str, *, connection: duckdb.DuckDBPyConnection = None) duckdb.DuckDBPyConnection ¶
-
Remove a previously created function
- duckdb.rollback(*, connection: duckdb.DuckDBPyConnection = None) duckdb.DuckDBPyConnection ¶
-
Roll back changes performed within a transaction
- duckdb.row_type(fields: object, *, connection: duckdb.DuckDBPyConnection = None) duckdb.duckdb.typing.DuckDBPyType ¶
-
Create a struct type object from ‘fields’
- duckdb.rowcount(*, connection: duckdb.DuckDBPyConnection = None) int ¶
-
Get result set row count
- duckdb.sql(query: object, *, alias: str = '', params: object = None, connection: duckdb.DuckDBPyConnection = None) duckdb.duckdb.DuckDBPyRelation ¶
-
Run a SQL query. If it is a SELECT statement, create a relation object from the given SQL query, otherwise run the query as-is.
- duckdb.sqltype(type_str: str, *, connection: duckdb.DuckDBPyConnection = None) duckdb.duckdb.typing.DuckDBPyType ¶
-
Create a type object by parsing the ‘type_str’ string
- duckdb.string_type(collation: str = '', *, connection: duckdb.DuckDBPyConnection = None) duckdb.duckdb.typing.DuckDBPyType ¶
-
Create a string type with an optional collation
- duckdb.struct_type(fields: object, *, connection: duckdb.DuckDBPyConnection = None) duckdb.duckdb.typing.DuckDBPyType ¶
-
Create a struct type object from ‘fields’
- duckdb.table(table_name: str, *, connection: duckdb.DuckDBPyConnection = None) duckdb.duckdb.DuckDBPyRelation ¶
-
Create a relation object for the named table
- duckdb.table_function(name: str, parameters: object = None, *, connection: duckdb.DuckDBPyConnection = None) duckdb.duckdb.DuckDBPyRelation ¶
-
Create a relation object from the named table function with given parameters
- duckdb.tf(*, connection: duckdb.DuckDBPyConnection = None) dict ¶
-
Fetch a result as dict of TensorFlow Tensors following execute()
- class duckdb.token_type¶
-
Bases:
pybind11_object
Members:
identifier
numeric_const
string_const
operator
keyword
comment
- comment = <token_type.comment: 5>¶
- identifier = <token_type.identifier: 0>¶
- keyword = <token_type.keyword: 4>¶
- property name¶
- numeric_const = <token_type.numeric_const: 1>¶
- operator = <token_type.operator: 3>¶
- string_const = <token_type.string_const: 2>¶
- property value¶
- duckdb.tokenize(query: str) list ¶
-
Tokenizes a SQL string, returning a list of (position, type) tuples that can be used for e.g., syntax highlighting
- duckdb.torch(*, connection: duckdb.DuckDBPyConnection = None) dict ¶
-
Fetch a result as dict of PyTorch Tensors following execute()
- duckdb.type(type_str: str, *, connection: duckdb.DuckDBPyConnection = None) duckdb.duckdb.typing.DuckDBPyType ¶
-
Create a type object by parsing the ‘type_str’ string
- duckdb.union_type(members: object, *, connection: duckdb.DuckDBPyConnection = None) duckdb.duckdb.typing.DuckDBPyType ¶
-
Create a union type object from ‘members’
- duckdb.unregister(view_name: str, *, connection: duckdb.DuckDBPyConnection = None) duckdb.DuckDBPyConnection ¶
-
Unregister the view name
- duckdb.unregister_filesystem(name: str, *, connection: duckdb.DuckDBPyConnection = None) None ¶
-
Unregister a filesystem
- duckdb.values(values: object, *, connection: duckdb.DuckDBPyConnection = None) duckdb.duckdb.DuckDBPyRelation ¶
-
Create a relation object from the passed values
- duckdb.view(view_name: str, *, connection: duckdb.DuckDBPyConnection = None) duckdb.duckdb.DuckDBPyRelation ¶
-
Create a relation object for the named view
- duckdb.write_csv(df: pandas.DataFrame, filename: str, *, sep: object = None, na_rep: object = None, header: object = None, quotechar: object = None, escapechar: object = None, date_format: object = None, timestamp_format: object = None, quoting: object = None, encoding: object = None, compression: object = None, overwrite: object = None, per_thread_output: object = None, use_tmp_file: object = None, partition_by: object = None, write_partition_columns: object = None, connection: duckdb.DuckDBPyConnection = None) None ¶
-
Write the relation object to a CSV file in ‘filename’