Join the DuckDB Discord server!


DuckDB is an in-process SQL OLAP database management system

Why DuckDB?

Simple

  • In-process, serverless
  • C++11, no dependencies, single-file build
  • APIs for Python/R/Java/…

Feature-rich

  • Transactions, persistence
  • Extensive SQL support
  • Direct Parquet & CSV querying

Fast

  • Vectorized engine
  • Optimized for analytics
  • Parallel query processing

Free

  • Free & Open Source
  • Permissive MIT License

All the benefits of a database, none of the hassle.

When to use DuckDB

  • Processing and storing tabular datasets, e.g. from CSV or Parquet files
  • Interactive data analysis, e.g. joining and aggregating multiple large tables
  • Concurrent large changes to multiple large tables, e.g. appending rows or adding, removing, and updating columns
  • Large result set transfer to client
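
The first two use cases above can be sketched in SQL. This is a minimal illustration under stated assumptions, not an excerpt from the DuckDB documentation: the files `orders.csv` and `customers.parquet` and their columns are hypothetical.

```sql
-- Hypothetical input files (not part of the original page):
--   orders.csv        with columns customer_id, amount
--   customers.parquet with columns customer_id, country

-- Query a CSV file directly; column names and types are inferred.
SELECT * FROM read_csv_auto('orders.csv') LIMIT 5;

-- Join the CSV file with a Parquet file and aggregate the result.
SELECT c.country,
       count(*)      AS n_orders,
       sum(o.amount) AS total_amount
FROM read_csv_auto('orders.csv') AS o
JOIN read_parquet('customers.parquet') AS c USING (customer_id)
GROUP BY c.country
ORDER BY total_amount DESC;

-- Store the CSV data in a table for later analysis.
CREATE TABLE orders AS SELECT * FROM read_csv_auto('orders.csv');
```

Querying the files in place avoids a separate load step. The final `CREATE TABLE` only persists across sessions when DuckDB is opened on a database file rather than in memory.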

When not to use DuckDB

  • High-volume transactional use cases (e.g. tracking orders in a webshop)
  • Large client/server installations for centralized enterprise data warehousing
  • Writing to a single database from multiple concurrent processes
  • Multiple concurrent processes reading from a single writable database

Blog

2023-03-03

Shredding Deeply Nested JSON, One Vector at a Time

TL;DR: We’ve recently improved DuckDB’s JSON extension so JSON files can be directly queried as if they were tables. DuckDB has a JSON extension that can be installed and loaded through SQL: INSTALL 'json'; LOAD 'json'; The JSON extension supports various functions to create, read, and manipulate JSON strings. These […]
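
As a rough sketch of what querying a JSON file as a table can look like, assuming a hypothetical file named `events.json` that holds an array of JSON objects (or one object per line):

```sql
-- Install and load the JSON extension, as described in the teaser above.
INSTALL 'json';
LOAD 'json';

-- 'events.json' is a hypothetical file name used only for illustration.
-- read_json_auto infers column names and types from the JSON objects.
SELECT * FROM read_json_auto('events.json') LIMIT 5;
```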

2023-02-24

JupySQL Plotting with DuckDB

TL;DR: JupySQL provides a seamless SQL experience in Jupyter and uses DuckDB to visualize larger-than-memory datasets in matplotlib. Introduction: Data visualization is essential for every data practitioner since it allows us to find patterns that otherwise would be hard to see. The typical approach for plotting tabular datasets […]

2023-02-13

Announcing DuckDB 0.7.0

The DuckDB team is happy to announce the latest DuckDB version (0.7.0) has been released. This release of DuckDB is named “Labradorius” after the Labrador Duck (Camptorhynchus labradorius) that was native to North America. To install the new version, please visit the installation guide. The full release notes can be […]
