Announcing DuckDB 1.4.2 LTS
TL;DR: Today we are releasing DuckDB 1.4.2, the second patch release of our LTS edition. The new release ships several bugfixes and performance optimizations as well as some new Iceberg and logger/profiler features. We also fixed vulnerabilities in DuckDB's database encryption.
In this blog post, we highlight a few important fixes and convenience improvements in DuckDB v1.4.2, the second patch release in DuckDB's 1.4 LTS line. To see the complete list of updates, please consult the release notes on GitHub.
To install the new version, please visit the installation page. Note that it can take a few hours to days for some client libraries (e.g., R, Rust) to be released due to the extra changes and review rounds required.
Features and Improvements
Iceberg Improvements
As in the v1.4.1 release blog post, we can start with some good news for our Iceberg users: DuckDB v1.4.2 ships a number of improvements to the iceberg extension. INSERT, UPDATE, and DELETE statements are now all supported:
ATTACH 'warehouse_name' AS iceberg_catalog (
TYPE iceberg,
    -- other options
);
CREATE TABLE iceberg_catalog.default.simple_table
(col1 INTEGER, col2 VARCHAR);
INSERT INTO iceberg_catalog.default.simple_table
VALUES (1, 'hello'), (2, 'world'), (3, 'duckdb is great');
DELETE FROM iceberg_catalog.default.simple_table
WHERE col1 = 2;
UPDATE iceberg_catalog.default.simple_table
SET col1 = col1 + 5
WHERE col1 = 1;
We will publish a separate blog post on these improvements shortly. Stay tuned!
Logger and Profiler Improvements
Time HTTP Requests
The logger can now log the duration of HTTP requests (#19691).
For example, if we query the Dutch railway tariffs table as a Parquet file (tariffs.parquet),
we can see multiple HTTP requests: a HEAD request and three GET requests:
CALL enable_logging('HTTP');
CREATE TABLE railway_tariffs AS
FROM 'https://blobs.duckdb.org/tariffs.parquet';
SELECT request.type, request.url, request.duration_ms
FROM duckdb_logs_parsed('HTTP');
┌─────────┬──────────────────────────────────────────┬─────────────┐
│ type │ url │ duration_ms │
│ varchar │ varchar │ int64 │
├─────────┼──────────────────────────────────────────┼─────────────┤
│ HEAD │ https://blobs.duckdb.org/tariffs.parquet │ 177 │
│ GET │ https://blobs.duckdb.org/tariffs.parquet │ 103 │
│ GET │ https://blobs.duckdb.org/tariffs.parquet │ 176 │
│ GET │ https://blobs.duckdb.org/tariffs.parquet │ 182 │
└─────────┴──────────────────────────────────────────┴─────────────┘
Accessing the Profiler's Output from the Logger
The logger can now also access the profiler's output (#19572).
This means that if both the profiler and the logger are enabled, you can log information such as the execution time of queries:
-- Enable profiling to JSON file
-- This is necessary to make sure that queries are profiled
PRAGMA profiling_output = 'profiling_output.json';
PRAGMA enable_profiling = 'json';
-- Enable logging to an in-memory table
CALL enable_logging();
-- Run some queries
CREATE TABLE small AS FROM range(1_000_000);
CREATE TABLE large AS FROM range(1_000_000_000);
PRAGMA disable_profiling;
SELECT query_id, type, metric, value::DECIMAL(15, 3) AS value
FROM duckdb_logs_parsed('Metrics')
WHERE metric = 'CPU_TIME';
You can see in the output that the first CREATE statement took about 3 milliseconds, while the second one took 3.3 seconds.
┌──────────┬─────────┬──────────┬───────────────┐
│ query_id │ type │ metric │ value │
│ uint64 │ varchar │ varchar │ decimal(15,3) │
├──────────┼─────────┼──────────┼───────────────┤
│ 8 │ Metrics │ CPU_TIME │ 0.003 │
│ 9 │ Metrics │ CPU_TIME │ 3.267 │
└──────────┴─────────┴──────────┴───────────────┘
Profiler Metrics
The profiler now supports several new metrics. These allow you to get a deeper understanding of where execution time is spent in your queries.
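If you only want to collect specific metrics, you can select them via DuckDB's `custom_profiling_settings` option. The sketch below reuses the JSON profiling setup shown earlier; the metric names are illustrative, so consult the profiling documentation for the full list available in your version:

```sql
-- Profile to a JSON file, restricted to a chosen set of metrics
PRAGMA profiling_output = 'profiling_output.json';
PRAGMA enable_profiling = 'json';
SET custom_profiling_settings = '{"CPU_TIME": "true", "OPERATOR_TIMING": "true"}';

SELECT count(*) FROM range(1_000_000);
PRAGMA disable_profiling;
```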
Performance Improvements
DuckDB v1.4.2 also ships some small performance improvements:
- #19477: DuckDB now buffers WAL index deletes, not only appends
- #19644: Detaching from a database is now faster
Vortex Support
DuckDB now supports the Vortex file format through the vortex extension.
First, install and load the extension:
INSTALL vortex;
LOAD vortex;
Then, you can write Vortex files as follows:
COPY (SELECT * FROM generate_series(0, 3) t(i))
TO 'my.vortex' (FORMAT vortex);
And read them using the read_vortex function:
SELECT * FROM read_vortex('my.vortex');
┌───────┐
│ i │
│ int64 │
├───────┤
│ 0 │
│ 1 │
│ 2 │
└───────┘
Fixes
We fixed several vulnerabilities, crashes, internal errors, incorrect results, and regressions, as well as several issues discovered by our fuzzer.
Vulnerabilities
We fixed four vulnerabilities in DuckDB's database encryption:
- DuckDB can fall back to an insecure random number generator (`pcg32`) to generate cryptographic keys or IVs.
- When clearing keys from memory, the compiler may optimize away the `memset()` call and leave sensitive data on the heap.
- By modifying the database header, an attacker could downgrade the encryption mode from GCM to CTR to bypass integrity checks.
- The return value of OpenSSL's `RAND_bytes()` call was not checked.
See the Security Advisory for more details.
If you are using database encryption, you are strongly encouraged to update to DuckDB v1.4.2.
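For reference, database encryption is enabled by attaching a database with the `ENCRYPTION_KEY` option; the file name and key below are placeholders:

```sql
-- Create or open an encrypted database file (placeholder key)
ATTACH 'encrypted.duckdb' AS enc (ENCRYPTION_KEY 'my_secret_key');
CREATE TABLE enc.measurements AS SELECT 42 AS answer;
DETACH enc;
```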
We would like to thank Greg Rubin and Sławek Kosowski for reporting these vulnerabilities!
Crashes and Internal Errors
- #19238: `MERGE INTO` an Iceberg table with `TIMESTAMPTZ` columns crashes
- #19355: Unknown expression type invalidates database
- #19357: Expected unified vector format of type `VARCHAR`, but found type `INT32`
- #19455: `MERGE INTO` failed: logical operator type mismatch
- #19498: Window function crash with `pdqsort_loop`
- #19700: RLE select bug
Incorrect Results
- #17757: UUID comparison in an aggregation filter broken on Linux
- #19327: Wrong result for `DISTINCT` and `LEFT JOIN`
- #19377: Array with values shows null depending on query
Regressions
Storage
- #19424: Fix issue in `MetadataManager` triggered when doing concurrent reads while checkpointing
- #19527: Ensure that DuckDB outputs the expected `STORAGE_VERSION`
- #19543: Error when setting `force_compression = 'zstd'` in an in-memory database