DuckDB is an in-process SQL OLAP database management system

Why DuckDB?

Simple

  • In-process, serverless
  • C++11, no dependencies, single file build
  • APIs for Python/R/Java/…

Feature-rich

  • Transactions, persistence
  • Extensive SQL support
  • Direct Parquet & CSV querying

Fast

  • Vectorized engine
  • Optimized for analytics
  • Parallel query processing

Free

  • Free & Open Source
  • Permissive MIT License

All the benefits of a database, none of the hassle.

When to use DuckDB

  • Processing and storing tabular datasets, e.g. from CSV or Parquet files
  • Interactive data analysis, e.g. joining and aggregating multiple large tables
  • Concurrent large changes to multiple large tables, e.g. appending rows or adding/removing/updating columns
  • Large result set transfer to client

When to not use DuckDB

  • High-volume transactional use cases (e.g. tracking orders in a webshop)
  • Large client/server installations for centralized enterprise data warehousing
  • Writing to a single database from multiple concurrent processes

Blog

2022-11-25

DuckCon 2023 - 2nd edition

The DuckDB team is excited to invite you all to our second DuckCon user group meeting. It will take place the day before FOSDEM in Brussels on Feb 3rd, 2023, at the Hilton Hotel. In this edition, we will have the DuckDB creators Hannes Mühleisen and Mark Raasveldt talking about […]

2022-11-14

Announcing DuckDB 0.6.0

The DuckDB team is happy to announce the latest DuckDB version (0.6.0) has been released. This release of DuckDB is named “Oxyura” after the White-headed duck (Oxyura leucocephala) which is an endangered species native to Eurasia. To install the new version, please visit the installation guide. Note that the release […]

2022-10-28

Lightweight Compression in DuckDB

TLDR: DuckDB supports efficient lightweight compression that is automatically used to keep data size down without incurring high costs for compression and decompression. When working with large amounts of data, compression is critical for reducing storage size and egress costs. Compression algorithms typically reduce data set size by 75-95%, depending […]
