Announcing DuckDB 1.2.0

The DuckDB team
Published on 2025-02-05

TL;DR: The DuckDB team is happy to announce that today we're releasing DuckDB version 1.2.0, codenamed “Histrionicus”.

To install the new version, please visit the installation guide. For the release notes, see the release page.

Some packages (Go, R, Java) take a few extra days to release due to the reviews required in the release pipelines.

We are proud to release DuckDB 1.2.0. This release is codenamed “Histrionicus” after the good-looking Harlequin duck (Histrionicus histrionicus), which inhabits "cold fast moving streams in North America, Greenland, Iceland and eastern Russia".

What's New in 1.2.0

There have been far too many changes to discuss each of them in detail, but we would like to highlight several particularly important and exciting features! Below is a summary of those new features with examples.

Breaking Changes

The random function now uses a larger state. This means that it's even more random™ now. Due to this change, fixed seeds will now produce different values than in previous versions of DuckDB.
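
As a minimal sketch (using the setseed function to fix the seed):

SELECT setseed(0.42); -- fix the random seed
SELECT random();      -- deterministic for this seed, but different from pre-1.2.0 results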

map['entry'] now returns a value, instead of a list of entries. For example, map(['k'], ['v'])['k'] now returns 'v', while previously it returned ['v']. We also introduced the map_extract_value function, which is now an alias for the bracket operator []. If you would like to return a list, use the map_extract function: map_extract(map(['k'], ['v']), 'k') = ['v'].
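
To summarize the new behavior in a single query:

SELECT
    map(['k'], ['v'])['k'] AS bracket,                      -- 'v' (previously ['v'])
    map_extract_value(map(['k'], ['v']), 'k') AS extracted, -- 'v'
    map_extract(map(['k'], ['v']), 'k') AS entry_list;      -- ['v']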

The indexing of list_reduce is fixed. When indexing is applied in list_reduce, the index is passed as the last parameter of the lambda function and indexing starts from 1. Therefore, list_reduce(['a', 'b'], (x, y, i) -> x || y || i) returns 'ab2'.
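
To see it in action:

SELECT list_reduce(['a', 'b'], (x, y, i) -> x || y || i) AS result;
-- the single reduction step sees x = 'a', y = 'b' and i = 2
-- (the 1-based position of y), giving 'ab2'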

Explicit Storage Versions

DuckDB v1.2.0 ships new compression methods, but they are not yet enabled by default to ensure that older DuckDB versions can read files produced by DuckDB v1.2.0.

In practice, this means that DuckDB v1.2.0 can read database files written by past stable DuckDB versions such as v1.0.0. When using DuckDB v1.2.0 with default settings, older versions can read files written by DuckDB v1.2.0.

You can opt in to newer, forwards-incompatible features using the following syntax:

ATTACH 'file.db' (STORAGE_VERSION 'v1.2.0');

This setting specifies the minimum DuckDB version that should be able to read the database file. When database files are written with this option, the resulting files cannot be opened by DuckDB releases older than the specified version. They can be read by the specified version and all newer versions of DuckDB.

If you have attached DuckDB databases, you can query their storage versions using the following command:

SELECT database_name, tags FROM duckdb_databases();

This shows the storage versions:

┌───────────────┬───────────────────────────────────┐
│ database_name │               tags                │
│    varchar    │       map(varchar, varchar)       │
├───────────────┼───────────────────────────────────┤
│ file1         │ {storage_version=v1.2.0}          │
│ file2         │ {storage_version=v1.0.0 - v1.1.3} │
│ ...           │ ...                               │
└───────────────┴───────────────────────────────────┘

This means that file2 can be opened by past DuckDB versions while file1 is compatible only with v1.2.0 (or future versions).

To convert from the new format to the old format for compatibility, use the following sequence in DuckDB v1.2.0:

ATTACH 'file1.db';
ATTACH 'converted_file.db' (STORAGE_VERSION 'v1.0.0');
COPY FROM DATABASE file1 TO converted_file;

Indexing

ALTER TABLE ... ADD PRIMARY KEY. After a long while, DuckDB is finally able to add a primary key to an existing table 🎉. So it is now possible to run this:

CREATE TABLE tbl(id INTEGER);
INSERT INTO tbl VALUES (42);
ALTER TABLE tbl ADD PRIMARY KEY (id);
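
The new constraint is then enforced as expected: inserting a duplicate key should now fail with a constraint violation (a sketch; the exact error message may vary):

INSERT INTO tbl VALUES (42); -- fails: duplicate key violates the primary key constraint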

Over-eager constraint checking addressed. We also resolved a long-standing issue with over-eager unique constraint checking. For example, the following sequence of commands used to throw an error but now works:

CREATE TABLE students (id INTEGER PRIMARY KEY, name VARCHAR);
INSERT INTO students VALUES (1, 'John Doe');

BEGIN; -- start transaction
DELETE FROM students WHERE id = 1;
INSERT INTO students VALUES (1, 'Jane Doe');
COMMIT; -- apply both changes

CSV Features

Latin-1 and UTF-16 encodings. Previously, DuckDB's CSV reader was limited to UTF-8 files. It can now read Latin-1 and UTF-16 files. For example:

FROM read_csv('cities-latin-1.csv', encoding = 'latin-1');

Multi-byte delimiters. DuckDB now supports delimiters of up to 4 bytes. This means that you can finally use the duck emoji as a column delimiter. For example, given the file example.dsv:

a🦆b
hello🦆world

You can read it with:

FROM read_csv('example.dsv', sep = '🦆');

Strict CSV parsing. The RFC 4180 specification defines requirements for well-formed CSV files, e.g., having a single line delimiter. By default, DuckDB now parses CSVs in so-called strict mode (strict_mode = true). For example, the following CSV file is rejected because of its mixed newline characters:

echo "a,b\r\nhello,42\nworld,84" > rfc_4180-defiant.csv
FROM read_csv('rfc_4180-defiant.csv');
Invalid Input Error:
Error when sniffing file "rfc_4180-defiant.csv".
It was not possible to automatically detect the CSV Parsing dialect/types

But it's parsed with the more lenient option strict_mode = false:

FROM read_csv('rfc_4180-defiant.csv', strict_mode = false);
┌─────────┬───────┐
│    a    │   b   │
│ varchar │ int64 │
├─────────┼───────┤
│ hello   │    42 │
│ world   │    84 │
└─────────┴───────┘

Performance improvements. The CSV parser in the new release uses a new algorithm for finding newlines during parallel execution. This leads to speedups of around 15%.

Unlimited row length. Previously, DuckDB was limited to CSV files with rows of up to 8 MB. The new version lifts this restriction, and lines can be of arbitrary length.

Parquet Features

Parquet dictionary and Bloom filter support. DuckDB now supports writing many more types using dictionary encoding. This should reduce file size in some cases. DuckDB is now also able to read and write Parquet Bloom filters. Bloom filters are small indexing data structures that can be used to exclude row groups if a filter is set. This is particularly useful for often-repeated but unordered data (e.g., categorical values). A separate blog post will follow.

Delta binary packed compression for Parquet. DuckDB now supports the DELTA_BINARY_PACKED encoding as well as the DELTA_LENGTH_BYTE_ARRAY and BYTE_STREAM_SPLIT encodings for Parquet files. A few weeks ago, we elaborated on these in a blog post.
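
As a sketch for trying these out (assuming the PARQUET_VERSION copy option described in that blog post, and a hypothetical table tbl):

-- write a Parquet file with the V2 page format, enabling the new encodings
COPY tbl TO 'tbl.parquet' (FORMAT parquet, PARQUET_VERSION v2);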

CLI Improvements

Safe mode. The DuckDB command line client now supports safe mode, which can be activated with the -safe flag or the .safe_mode dot command. In this mode, the CLI client is prevented from accessing external files other than the database file that it was initially connected to, and from interacting with the host file system. For more information, see the Securing DuckDB page in the Operations Manual.

Better autocomplete. The CLI's autocomplete now uses a Parsing Expression Grammar (PEG), which also enables improved error messages and suggestions.

Pretty-printing large numbers. The CLI prints a human-readable summary below a large number if it is rendering only a single row:

SELECT 100_000_000 AS x, pi() * 1e9 AS y;
┌──────────────────┬───────────────────┐
│        x         │         y         │
│      int32       │      double       │
├──────────────────┼───────────────────┤
│    100000000     │ 3141592653.589793 │
│ (100.00 million) │  (3.14 billion)   │
└──────────────────┴───────────────────┘

Friendly SQL

Prefix aliases. Expression and table aliases can now be specified before the item they refer to (instead of using the well-known AS syntax). This can improve readability in some cases, for example:

SELECT 
    e1: some_long_and_winding_expression,
    e2: t2.a_column_name 
FROM
    t1: long_schema.some_long_table_name,
    t2: short_s.tbl;
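
For comparison, the same query written with conventional AS aliases:

SELECT
    some_long_and_winding_expression AS e1,
    t2.a_column_name AS e2
FROM
    long_schema.some_long_table_name AS t1,
    short_s.tbl AS t2;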

Credit for this idea goes to Michael Toy. A separate blog post will follow soon.

RENAME clause. DuckDB now supports the RENAME clause in SELECT. This allows renaming fields emitted by the * expression:

CREATE TABLE integers(col1 INT, col2 INT);
INSERT INTO integers VALUES (42, 84);
SELECT * RENAME (col1 AS new_col1) FROM integers;
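
This returns:

┌──────────┬───────┐
│ new_col1 │ col2  │
│  int32   │ int32 │
├──────────┼───────┤
│    42    │  84   │
└──────────┴───────┘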

Star LIKE. The LIKE and SIMILAR TO operators can now be applied to * expressions as a shorthand for the COLUMNS syntax.

CREATE TABLE key_val(key VARCHAR, val1 INT, val2 INT);
INSERT INTO key_val VALUES ('v', 42, 84);
SELECT * LIKE 'val%' FROM key_val;
┌───────┬───────┐
│ val1  │ val2  │
│ int32 │ int32 │
├───────┼───────┤
│  42   │  84   │
└───────┴───────┘
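
For reference, a roughly equivalent query using the existing COLUMNS expression with a regular expression:

SELECT COLUMNS('val.*') FROM key_val;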

Optimizations

We have spent a lot of time on DuckDB's optimizer. Optimizer improvements are hard to quantify, but as a result of these optimizations, DuckDB achieves, for example, a 13% improvement on the total runtime of the TPC-H SF100 queries when run on a MacBook Pro, compared to the previous release.

C API for Extensions

Currently, DuckDB extensions use DuckDB's internal C++ structures. This – along with some fun linking issues – requires lock-step development of extensions with mainline DuckDB and constant updates. Starting with this release, we expose a new C-style API for extensions in duckdb_extension.h. This API can be used to create, for example, scalar, aggregate, or table functions in DuckDB. There are two main advantages of using this API: first, many programming languages (e.g., Go, Rust and even Java) have direct bindings to C APIs, making integration rather easy. Second, the C extension API is stable and backwards-compatible, meaning that extensions targeting this API will keep working with new DuckDB versions. We will follow up with a new extension template.

musl Extensions

Distributing extensions for musl. The musl C library is often used in lightweight environments such as Docker containers running Alpine Linux. Starting with this release, we officially support musl and distribute extensions for the linux_amd64_musl platform (but not yet for linux_arm64_musl).

Final Thoughts

These were a few highlights – but there are many more features and improvements in this release. There have been over 5 000 commits by over 70 contributors since we released 1.1.3. The full – very long – release notes can be found on GitHub.

We would like to thank our amazing community once again for using DuckDB, building cool projects on it, and improving DuckDB by providing us with feedback. Your contributions truly mean a lot!