COPY
moves data between DuckDB tables and external Comma Separated Value (CSV) files.
CSV Import
COPY ... FROM
imports data into DuckDB from an external CSV file into an existing table. The data is appended to whatever data is in the table already. The amount of columns inside the file must match the amount of columns in the table table_name
, and the contents of the columns must be convertible to the column types of the table. In case this is not possible, an error will be thrown.
If a list of columns is specified, COPY
will only copy the data in the specified columns from the file. If there are any columns in the table that are not in the column list, COPY ... FROM
will insert the default values for those columns
-- Copy the contents of a comma-separated file 'test.csv' without a header into the table 'test'
COPY test FROM 'test.csv';
-- Copy the contents of a comma-separated file with a header into the 'category' table
COPY category FROM 'categories.csv' ( HEADER );
-- Copy the contents of 'lineitem.tbl' into the 'lineitem' table, where the contents are delimited by a pipe character ('|')
COPY lineitem FROM 'lineitem.tbl' ( DELIMITER '|' );
-- Copy the contents of 'lineitem.tbl' into the 'lineitem' table, where the delimiter, quote character, and presence of a header are automatically detected
COPY lineitem FROM 'lineitem.tbl' ( AUTO_DETECT TRUE );
-- Read the contents of a comma-separated file 'names.csv' into the 'name' column of the 'category' table. Any other columns of this table are filled with their default value.
COPY category(name) FROM 'names.csv';
Syntax
CSV Export
COPY ... TO
exports data from DuckDB to an external CSV file. It has mostly the same set of options as COPY ... FROM
, however, in the case of COPY ... TO
the options specify how the CSV file should be written to disk. Any CSV file created by COPY ... TO
can be copied back into the database by using COPY ... FROM
with the same set of options.
The COPY ... TO
function can be called specifying either a table name, or a query. When a table name is specified, the contents of the entire table will be written into the resulting CSV file. When a query is specified, the query is executed and the result of the query is written to the resulting file.
-- Copy the contents of the 'lineitem' table to the file 'lineitem.tbl', where the columns are delimited by a pipe character ('|'), including a header line.
COPY lineitem TO 'lineitem.tbl' ( DELIMITER '|', HEADER );
-- Copy the l_orderkey column of the 'lineitem' table to the file 'orderkey.tbl'
COPY lineitem(l_orderkey) TO 'orderkey.tbl' ( DELIMITER '|' );
-- Copy the result of a query to the file 'query.csv', including a header with column names
COPY (SELECT 42 AS a, 'hello' AS b) TO 'query.csv' WITH (HEADER 1, DELIMITER ',');
Syntax
Parameters
Name | Description |
---|---|
table_name |
The name (optionally schema-qualified) of an existing table. |
column_name |
An optional list of columns to be copied. If no column list is specified, all columns of the table will be copied. |
filename |
The path and name of the input or output file. |
boolean |
Specifies whether the selected option should be turned on or off. You can write TRUE , ON , or 1 to enable the option, and FALSE , OFF , or 0 to disable it. The boolean value can also be omitted, in which case TRUE is assumed. |
FORMAT |
Specifies the copy function to use. This defaults to CSV , but other options can be available (e.g. PARQUET ). |
DELIMITER |
Specifies the string that separates columns within each row (line) of the file. The default value is a comma (, ). |
NULL |
Specifies the string that represents a NULL value. The default is an empty string. Please note that in COPY ... FROM both an unquoted empty string and a quoted empty string represent a NULL value. If any other NULL string is specified, again both its quoted and its unquoted appearance represent a NULL value. COPY ... TO does not quote NULL values on output, even if FORCE_QUOTE is true. |
HEADER |
Specifies that the file contains a header line with the names of each column in the file. Thus, COPY ... FROM ignores the first line when importing data, whereas on output (COPY ... TO ) the first line of the file contains the column names of the exported columns. |
QUOTE |
Specifies the quoting string to be used when a data value is quoted. The default is double-quote (" ). |
ESCAPE |
Specifies the string that should appear before a data character sequence that matches the QUOTE value. The default is the same as the QUOTE value (so that the quoting string is doubled if it appears in the data). |
DATEFORMAT |
Specifies the date format to use when parsing dates. See Date Format |
TIMESTAMPFORMAT |
Specifies the date format to use when parsing timestamps. See Date Format |
FORCE_QUOTE |
Forces quoting to be used for all non-NULL values in each specified column. NULL output is never quoted. If * is specified, non-NULL values will be quoted in all columns. This option is allowed only in COPY ... TO . |
FORCE_NOT_NULL |
Do not match the specified columns’ values against the NULL string. In the default case where the NULL string is empty, this means that empty values will be read as zero-length strings rather than NULLs. This option is allowed only in COPY ... FROM . |
ENCODING |
If this option is used, its value must be UTF8 . With any other encoding an error will be thrown. |
AUTO_DETECT |
Option for CSV parsing. If TRUE , the parser will attempt to detect the input format and data types automatically. DELIM /SEP , QUOTE , ESCAPE , and HEADER parameters become optional. |
SAMPLE_SIZE |
Option to define number of sample rows for automatic CSV type detection. Chunks of sample rows will be drawn from different locations of the input file. Set to -1 to scan the entire input file. Only the first max. 1024 rows will be used for dialect detection. |
ALL_VARCHAR |
Option to skip type detection for CSV parsing and assume all columns to be of type VARCHAR. |
It is recommended that the file name used in
COPY
always be specified as an absolute path.The values in each record are separated by the
DELIMITER
string. If the value contains theDELIMITER
string, theQUOTE
string, theNULL
string, a carriage return, or line feed character, then the whole value is prefixed and suffixed by theQUOTE
string, and any occurrence within the value of aQUOTE
string or theESCAPE
string is preceded by theESCAPE
string. You can also useFORCE_QUOTE
to force quotes when outputting non-NULL values in specific columns.The CSV format has no standard way to distinguish a NULL value from an empty string. You can use
FORCE_NOT_NULL
to preventNULL
input comparisons for specific columns.In CSV format, all characters are significant. A quoted value surrounded by white space, or any characters other than
DELIMITER
, will include those characters. This can cause errors if you import data from a system that pads CSV lines with white space out to some fixed width. If such a situation arises you might need to preprocess the CSV file to remove the trailing white space, before importing the data into DuckDB.