2024-09-09 The DuckDB team
The DuckDB team is happy to announce that today we're releasing DuckDB version 1.1.0, codenamed “Eatoni”.
2024-08-19 Gabor Szarnyas
We use a simple example data set to present a few tricks that are useful when using DuckDB.
2024-08-15 Mark Raasveldt, Hannes Mühleisen, Gabor Szarnyas, Kelly de Smit
2024-08-08 Tania Bogatsch and Maia de Graaf
2024-07-09 Mark Raasveldt
2024-07-05 The DuckDB team
DuckDB extensions can now be published via the [DuckDB Community Extensions repository](https://github.com/duckdb/community-extensions). The repository makes it easier for users to install extensions using the `INSTALL ⟨extension name⟩ FROM community` syntax. Extension developers avoid the burdens of compilation and distribution.
2024-06-26 Alex Monahan
In the last 3 years, DuckDB has become 3-25x faster and can analyze ~10x larger datasets, all on the same hardware.
2024-06-22 The DuckDB Team
2024-06-20 Gabor Szarnyas
DuckDB's CLI client is portable to many platforms and architectures. It handles CSV files conveniently and offers users the same rich SQL syntax everywhere. These characteristics make DuckDB an ideal tool to complement traditional Unix tools for data processing in the command line.
2024-06-10 Sam Ansmink
DuckDB now has native support for [Delta Lake](https://delta.io/), an open-source lakehouse framework, with the [Delta](/docs/extensions/delta) extension.
2024-06-03 Mark Raasveldt and Hannes Mühleisen
The DuckDB team is very happy to announce that today we’re releasing DuckDB version 1.0.0, codename “Snow Duck” (anas nivis).
2024-05-31 Gabor Szarnyas
We use a real-world railway dataset to demonstrate some of DuckDB's key features, including querying different file formats, connecting to remote endpoints, and using advanced SQL features.
2024-05-29 The Hugging Face and DuckDB teams
DuckDB can now read data from [Hugging Face](https://huggingface.co/) via the `hf://` prefix.
2024-05-03 Max Gabrielsson
This blog post shows a preview of DuckDB's new [`vss` extension](/docs/extensions/vss), which introduces support for HNSW (Hierarchical Navigable Small Worlds) indexes to accelerate vector similarity search.
2024-04-02 Hannes Mühleisen
The new R package duckplyr translates the dplyr API to DuckDB's execution engine.
2024-03-29 Laurens Kuiper
Since the 0.9.0 release, DuckDB’s fully parallel aggregate hash table can efficiently aggregate over many more groups than fit in memory.
2024-03-26 Hannes Mühleisen
A 42 kB Parquet file can contain over 4 PB of data.
2024-03-22 Sam Ansmink
While core DuckDB has zero external dependencies, building extensions with dependencies is now very simple, with built-in support for vcpkg, an open-source package manager with support for over 2000 C/C++ packages. Interested in building your own? Check out the [extension template](https://github.com/duckdb/extension-template).
2024-03-01 Alex Monahan
Combining multiple features of DuckDB’s [friendly SQL](/docs/guides/sql_features/friendly_sql) allows for highly flexible queries that can be reused across tables.
2024-02-13 Mark Raasveldt and Hannes Mühleisen
The DuckDB team is happy to announce the latest DuckDB release (0.10.0). This release is named Fusca after the [Velvet scoter](https://en.wikipedia.org/wiki/Velvet_scoter) native to Europe.
2024-01-26 Mark Raasveldt
DuckDB can attach MySQL, Postgres, and SQLite databases in addition to databases stored in its own format. This allows data to be read into DuckDB and moved between these systems in a convenient manner.
2023-12-18 Carlo Piovesan
DuckDB-Wasm users can now load DuckDB extensions, allowing them to run extensions in the browser.
2023-11-03 Tom Ebergen
The H2O.ai db-benchmark has been updated with new results. In addition, the AWS EC2 instance used for benchmarking has been changed to a c6id.metal for improved repeatability and fairness across libraries. DuckDB is the fastest library for both join and group by queries at almost every data size.
2023-10-27 Pedro Holanda
DuckDB is primarily focused on performance, leveraging the capabilities of modern file formats. At the same time, we also pay attention to flexible, non-performance-driven formats like CSV files. To create a nice and pleasant experience when reading from CSV files, DuckDB implements a CSV sniffer that automatically detects CSV dialect options, column types, and even skips dirty data. The sniffing process allows users to efficiently explore CSV files without needing to provide any input about the file format.
2023-10-06 Mark Raasveldt, Hannes Mühleisen, Gabor Szarnyas
2023-09-26 Mark Raasveldt and Hannes Mühleisen
2023-09-15 Richard Wesley
DuckDB supports AsOf Joins – a way to match nearby values. They are especially useful for searching event tables for temporal analytics.
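To make "matching nearby values" concrete, here is a minimal plain-Python sketch of the AsOf idea: each event is paired with the most recent price recorded at or before the event's time. This illustrates the concept only and is not DuckDB's implementation; the function name `asof_join` and the sample data are hypothetical.

```python
from bisect import bisect_right


def asof_join(events, prices):
    """Pair each (time, label) event with the latest price at or before it.

    `prices` is a list of (time, value) tuples sorted by time.
    Events that precede every price are paired with None.
    """
    price_times = [t for t, _ in prices]
    joined = []
    for event_time, label in events:
        # Position of the last price whose time is <= event_time.
        idx = bisect_right(price_times, event_time) - 1
        price = prices[idx][1] if idx >= 0 else None
        joined.append((event_time, label, price))
    return joined


# Hypothetical sample data: prices change at times 1, 5, and 9.
prices = [(1, 10.0), (5, 11.5), (9, 12.0)]
events = [(0, "a"), (5, "b"), (7, "c"), (12, "d")]
result = asof_join(events, prices)
```

An exact-match equality join would only pair event `b` (time 5) with a price; the AsOf lookup also resolves `c` and `d` to the last known price.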
2023-08-23 Alex Monahan
DuckDB continues to push the boundaries of SQL syntax to both simplify queries and make more advanced analyses possible. Highlights include dynamic column selection, queries that start with the FROM clause, function chaining, and list comprehensions. We boldly go where no SQL engine has gone before! For more details, see the documentation for [friendly SQL features](/docs/guides/sql_features/friendly_sql).
2023-08-04 Pedro Holanda
DuckDB has added support for [Arrow Database Connectivity (ADBC)](https://arrow.apache.org/adbc/0.5.1/index.html), an API standard that enables efficient data ingestion and retrieval from database systems, similar to the [Open Database Connectivity (ODBC)](https://learn.microsoft.com/en-us/sql/odbc/microsoft-open-database-connectivity-odbc?view=sql-server-ver16) interface. However, unlike ODBC, ADBC specifically caters to the columnar storage model, facilitating fast data transfers between a columnar database and an external application.
2023-07-07 Pedro Holanda, Thijs Bruineman and Phillip Cloud
DuckDB now supports vectorized Scalar Python User Defined Functions (UDFs). By implementing Python UDFs, users can easily expand the functionality of DuckDB while taking advantage of DuckDB's fast execution model, SQL and data safety.
2023-05-26 Mark Raasveldt
2023-05-17 Mark Raasveldt and Hannes Mühleisen
2023-05-12 Mark Raasveldt and Hannes Mühleisen
2023-04-28 Max Gabrielsson
DuckDB now has an official [Spatial extension](https://github.com/duckdb/duckdb_spatial) to enable geospatial processing.
2023-04-28 Hannes Mühleisen
2023-04-21 Tristan Celder
DuckDB now has a native Swift API. DuckDB on mobile, here we go!
2023-04-14 Tom Ebergen
We've resurrected the H2O.ai database-like ops benchmark with up-to-date libraries and plan to keep re-running it.
2023-03-03 Laurens Kuiper
We've recently improved DuckDB's JSON extension so JSON files can be directly queried as if they were tables.
2023-02-24 Guest post by Eduardo Blancas
[JupySQL](https://github.com/ploomber/jupysql) provides a seamless SQL experience in Jupyter and uses DuckDB to visualize larger-than-memory datasets in matplotlib.
2023-02-13 Mark Raasveldt
2022-11-25 Pedro Holanda
2022-11-14 Mark Raasveldt
2022-10-28 Mark Raasveldt
DuckDB supports efficient lightweight compression that is automatically used to keep data size down without incurring high costs for compression and decompression.
2022-10-12 Guest post by Jacob Matson
A fast, free, and open-source Modern Data Stack (MDS) can now be fully deployed on your laptop or to a single machine using the combination of [DuckDB](https://duckdb.org/), [Meltano](https://meltano.com/), [dbt](https://www.getdbt.com/), and [Apache Superset](https://superset.apache.org/).
2022-09-30 Hannes Mühleisen
DuckDB can now directly query tables stored in PostgreSQL and speed up complex analytical queries without duplicating data.
2022-07-27 Pedro Holanda
DuckDB uses Adaptive Radix Tree (ART) Indexes to enforce constraints and to speed up query filters. Up to this point, indexes were not persisted, causing issues like loss of indexing information and high reload times for tables with data constraints. We now persist ART Indexes to disk, drastically diminishing database loading times (up to orders of magnitude), and we no longer lose track of existing indexes. This blog post contains a deep dive into the implementation of ART storage, benchmarks, and future work. Finally, to better understand how our indexes are used, I'm asking you to answer the following [survey](https://forms.gle/eSboTEp9qpP7ybz98). It will guide us when defining our future roadmap.
2022-05-27 Richard Wesley
DuckDB has fully parallelised range joins that can efficiently join millions of range predicates.
2022-05-04 Alex Monahan
DuckDB offers several extensions to the SQL syntax. For a full list of these features, see the [Friendly SQL documentation page](/docs/guides/sql_features/friendly_sql).
2022-03-07 Hannes Mühleisen and Mark Raasveldt
DuckDB has a fully parallelized aggregate hash table that can efficiently aggregate over millions of groups.
2022-01-06 Richard Wesley
The DuckDB ICU extension now provides time zone support.
2021-12-03 Pedro Holanda and Jonathan Keane
The zero-copy integration between DuckDB and Apache Arrow allows for rapid analysis of larger-than-memory datasets in Python and R using either SQL or relational APIs.
2021-11-26 Pedro Holanda
2021-11-12 Richard Wesley
DuckDB, a free and open-source analytical data management system, has a windowing API that can compute complex moving aggregates like interquartile ranges and median absolute deviation much faster than conventional approaches.
2021-10-29 André Kohn and Dominik Moritz
[DuckDB-Wasm](https://github.com/duckdb/duckdb-wasm) is an in-process analytical SQL database for the browser. It is powered by WebAssembly, speaks Arrow fluently, reads Parquet, CSV and JSON files backed by Filesystem APIs or HTTP requests and has been tested with Chrome, Firefox, Safari and Node.js. You can try it in your browser at [shell.duckdb.org](https://shell.duckdb.org) or on [Observable](https://observablehq.com/@cmudig/duckdb).
2021-10-13 Richard Wesley
DuckDB, a free and open-source analytical data management system, has a state-of-the-art windowing engine that can compute complex moving aggregates like interquartile ranges as well as simpler moving averages.
2021-08-27 Laurens Kuiper
DuckDB, a free and open-source analytical data management system, has a new highly efficient parallel sorting implementation that can sort much more data than fits in main memory.
2021-06-25 Hannes Mühleisen and Mark Raasveldt
DuckDB, a free and open-source analytical data management system, can run SQL queries directly on Parquet files and automatically take advantage of the advanced features of the Parquet format.
2021-05-14 Mark Raasveldt and Hannes Mühleisen
DuckDB, a free and open-source analytical data management system, can efficiently run SQL queries directly on Pandas DataFrames.
2021-01-25 Laurens Kuiper
DuckDB now has full-text search functionality, similar to the FTS5 extension in SQLite. The main difference is that our FTS extension is fully formulated in SQL. We tested it out on TREC disks 4 and 5.