Search Shortcut cmd + k | ctrl + k
Search cmd+k ctrl+k
Dark Mode
1.1 (stable)
Friendly SQL

DuckDB offers several advanced SQL features and syntactic sugar to make SQL queries more concise. We refer to these colloquially as “friendly SQL”.

Several of these features are also supported in other systems while some are (currently) exclusive to DuckDB.

Clauses

  • Creating tables and inserting data:
  • Describing tables and computing statistics:
    • DESCRIBE: this clause provides a succinct summary of the schema of a table or query.
    • SUMMARIZE: this clause returns summary statistics for a table or query.
  • Making SQL clauses more compact:
    • FROM-first syntax with an optional SELECT clause: DuckDB allows queries in the form of FROM tbl which selects all columns (performing a SELECT * statement).
    • GROUP BY ALL: this clause allows omitting the group-by columns by inferring them from the list of attributes in the SELECT clause.
    • ORDER BY ALL: this clause allows ordering on all columns (e.g., to ensure deterministic results).
    • SELECT * EXCLUDE: the EXCLUDE option allows excluding specific columns from the * expression.
    • SELECT * REPLACE: the REPLACE option allows replacing specific columns with different expressions in a * expression.
    • UNION BY NAME: this clause performing the UNION operation along the names of columns (instead of relying on positions).
  • Transforming tables:
    • PIVOT to turn long tables to wide tables.
    • UNPIVOT to turn wide tables to long tables.
  • Defining SQL-level variables:

Query Features

Literals and Identifiers

Data Types

Data Import

Functions and Expressions

Join Types

Trailing Commas

DuckDB allows trailing commas, both when listing entities (e.g., column and table names) and when constructing LIST items. For example, the following query works:

SELECT
    42 AS x,
    ['a', 'b', 'c',] AS y,
    'hello world' AS z,
;

"Top-N in Group" Queries

Computing the "top-N rows in a group" ordered by some criteria is a common task in SQL that unfortunately often requires a complex query involving window functions and/or subqueries.

To aid in this, DuckDB provides the aggregate functions max(arg, n), min(arg, n), arg_max(arg, val, n), arg_min(arg, val, n), max_by(arg, val, n) and min_by(arg, val, n) to efficiently return the "top" n rows in a group based on a specific column in either ascending or descending order.

For example, let's use the following table:

SELECT * FROM t1;
┌─────────┬───────┐
│   grp   │  val  │
│ varchar │ int32 │
├─────────┼───────┤
│ a       │     2 │
│ a       │     1 │
│ b       │     5 │
│ b       │     4 │
│ a       │     3 │
│ b       │     6 │
└─────────┴───────┘

We want to get a list of the top-3 val values in each group grp. The conventional way to do this is to use a window function in a subquery:

SELECT array_agg(rs.val), rs.grp 
FROM (
    SELECT val, grp, row_number() OVER (PARTITION BY grp ORDER BY val DESC) as rid 
    FROM t1 ORDER BY val DESC
) AS rs 
WHERE rid < 4 
GROUP BY rs.grp;
┌───────────────────┬─────────┐
│ array_agg(rs.val) │   grp   │
│      int32[]      │ varchar │
├───────────────────┼─────────┤
│ [3, 2, 1]         │ a       │
│ [6, 5, 4]         │ b       │
└───────────────────┴─────────┘

But in DuckDB, we can do this much more concisely (and efficiently!):

SELECT max(val, 3) FROM t1 GROUP BY grp;
┌─────────────┐
│ max(val, 3) │
│   int32[]   │
├─────────────┤
│ [3, 2, 1]   │
│ [6, 5, 4]   │
└─────────────┘