Documentation
/ SQL
/ Functions
Nested Functions
This section describes functions and operators for examining and manipulating nested values. There are five nested data types:
ARRAY
, LIST
, MAP
, STRUCT
, and UNION
.
Name |
Description |
list[index] |
Bracket notation serves as an alias for list_extract . |
list[begin:end] |
Bracket notation with colon is an alias for list_slice . |
list[begin:end:step] |
list_slice in bracket notation with an added step feature. |
array_pop_back(list) |
Returns the list without the last element. |
array_pop_front(list) |
Returns the list without the first element. |
flatten(list_of_lists) |
Concatenate a list of lists into a single list. This only flattens one level of the list (see examples). |
len(list) |
Return the length of the list. |
list_aggregate(list, name) |
Executes the aggregate function name on the elements of list . See the List Aggregates section for more details. |
list_any_value(list) |
Returns the first non-null value in the list. |
list_append(list, element) |
Appends element to list . |
list_concat(list1, list2) |
Concatenates two lists. |
list_contains(list, element) |
Returns true if the list contains the element. |
list_cosine_similarity(list1, list2) |
Compute the cosine similarity between two lists. |
list_distance(list1, list2) |
Calculates the Euclidean distance between two points with coordinates given in two inputs lists of equal length. |
list_distinct(list) |
Removes all duplicates and NULL values from a list. Does not preserve the original order. |
list_dot_product(list1, list2) |
Computes the dot product of two same-sized lists of numbers. |
list_extract(list, index) |
Extract the index th (1-based) value from the list. |
list_filter(list, lambda) |
Constructs a list from those elements of the input list for which the lambda function returns true. See the Lambda Functions page for more details. |
list_grade_up(list) |
Works like sort, but the results are the indexes that correspond to the position in the original list instead of the actual values. |
list_has_all(list, sub-list) |
Returns true if all elements of sub-list exist in list. |
list_has_any(list1, list2) |
Returns true if any elements exist is both lists. |
list_intersect(list1, list2) |
Returns a list of all the elements that exist in both l1 and l2 , without duplicates. |
list_position(list, element) |
Returns the index of the element if the list contains the element. |
list_prepend(element, list) |
Prepends element to list . |
list_reduce(list, lambda) |
Returns a single value that is the result of applying the lambda function to each element of the input list. See the Lambda Functions page for more details. |
list_resize(list, size[, value]) |
Resizes the list to contain size elements. Initializes new elements with value or NULL if value is not set. |
list_reverse_sort(list) |
Sorts the elements of the list in reverse order. See the Sorting Lists section for more details about the NULL sorting order. |
list_reverse(list) |
Reverses the list. |
list_select(value_list, index_list) |
Returns a list based on the elements selected by the index_list . |
list_slice(list, begin, end, step) |
list_slice with added step feature. |
list_slice(list, begin, end) |
Extract a sublist using slice conventions. Negative values are accepted. See slicing. |
list_sort(list) |
Sorts the elements of the list. See the Sorting Lists section for more details about the sorting order and the NULL sorting order. |
list_transform(list, lambda) |
Returns a list that is the result of applying the lambda function to each element of the input list. See the Lambda Functions page for more details. |
list_unique(list) |
Counts the unique elements of a list. |
list_value(any, ...) |
Create a LIST containing the argument values. |
list_where(value_list, mask_list) |
Returns a list with the BOOLEAN s in mask_list applied as a mask to the value_list . |
list_zip(list1, list2, ...) |
Zips k LIST s to a new LIST whose length will be that of the longest list. Its elements are structs of k elements list_1 , …, list_k . Elements missing will be replaced with NULL . |
unnest(list) |
Unnests a list by one level. Note that this is a special function that alters the cardinality of the result. See the unnest page for more details. |
Description |
Bracket notation serves as an alias for list_extract . |
Example |
[4, 5, 6][3] |
Result |
6 |
Alias |
list_extract |
Description |
Bracket notation with colon is an alias for list_slice . |
Example |
[4, 5, 6][2:3] |
Result |
[5, 6] |
Alias |
list_slice |
Description |
list_slice in bracket notation with an added step feature. |
Example |
[4, 5, 6][:-:2] |
Result |
[4, 6] |
Alias |
list_slice |
Description |
Returns the list without the last element. |
Example |
array_pop_back(l) |
Result |
[4, 5] |
Description |
Returns the list without the first element. |
Example |
array_pop_front(l) |
Result |
[5, 6] |
Description |
Concatenate a list of lists into a single list. This only flattens one level of the list (see examples). |
Example |
flatten([[1, 2], [3, 4]]) |
Result |
[1, 2, 3, 4] |
Description |
Return the length of the list. |
Example |
len([1, 2, 3]) |
Result |
3 |
Alias |
array_length |
Description |
Executes the aggregate function name on the elements of list . See the List Aggregates section for more details. |
Example |
list_aggregate([1, 2, NULL], 'min') |
Result |
1 |
Aliases |
list_aggr , aggregate , array_aggregate , array_aggr |
Description |
Returns the first non-null value in the list. |
Example |
list_any_value([NULL, -3]) |
Result |
-3 |
Description |
Appends element to list . |
Example |
list_append([2, 3], 4) |
Result |
[2, 3, 4] |
Aliases |
array_append , array_push_back |
Description |
Concatenates two lists. |
Example |
list_concat([2, 3], [4, 5, 6]) |
Result |
[2, 3, 4, 5, 6] |
Aliases |
list_cat , array_concat , array_cat |
Description |
Returns true if the list contains the element. |
Example |
list_contains([1, 2, NULL], 1) |
Result |
true |
Aliases |
list_has , array_contains , array_has |
Description |
Compute the cosine similarity between two lists. |
Example |
list_cosine_similarity([1, 2, 3], [1, 2, 5]) |
Result |
0.9759000729485332 |
Description |
Calculates the Euclidean distance between two points with coordinates given in two inputs lists of equal length. |
Example |
list_distance([1, 2, 3], [1, 2, 5]) |
Result |
2.0 |
Description |
Removes all duplicates and NULL values from a list. Does not preserve the original order. |
Example |
list_distinct([1, 1, NULL, -3, 1, 5]) |
Result |
[1, 5, -3] |
Alias |
array_distinct |
Description |
Computes the dot product of two same-sized lists of numbers. |
Example |
list_dot_product([1, 2, 3], [1, 2, 5]) |
Result |
20.0 |
Alias |
list_inner_product |
Description |
Extract the index th (1-based) value from the list. |
Example |
list_extract([4, 5, 6], 3) |
Result |
6 |
Aliases |
list_element , array_extract |
Description |
Constructs a list from those elements of the input list for which the lambda function returns true. See the Lambda Functions page for more details. |
Example |
list_filter([4, 5, 6], x -> x > 4) |
Result |
[5, 6] |
Aliases |
array_filter , filter |
Description |
Works like sort, but the results are the indexes that correspond to the position in the original list instead of the actual values. |
Example |
list_grade_up([30, 10, 40, 20]) |
Result |
[2, 4, 1, 3] |
Alias |
array_grade_up |
Description |
Returns true if all elements of sub-list exist in list. |
Example |
list_has_all([4, 5, 6], [4, 6]) |
Result |
true |
Alias |
array_has_all |
Description |
Returns true if any elements exist is both lists. |
Example |
list_has_any([1, 2, 3], [2, 3, 4]) |
Result |
true |
Alias |
array_has_any |
Description |
Returns a list of all the elements that exist in both l1 and l2 , without duplicates. |
Example |
list_intersect([1, 2, 3], [2, 3, 4]) |
Result |
[2, 3] |
Alias |
array_intersect |
Description |
Returns the index of the element if the list contains the element. |
Example |
list_contains([1, 2, NULL], 2) |
Result |
2 |
Aliases |
list_indexof , array_position , array_indexof |
Description |
Prepends element to list . |
Example |
list_prepend(3, [4, 5, 6]) |
Result |
[3, 4, 5, 6] |
Aliases |
array_prepend , array_push_front |
Description |
Returns a single value that is the result of applying the lambda function to each element of the input list. See the Lambda Functions page for more details. |
Example |
list_reduce([4, 5, 6], (x, y) -> x + y) |
Result |
15 |
Aliases |
array_reduce , reduce |
Description |
Resizes the list to contain size elements. Initializes new elements with value or NULL if value is not set. |
Example |
list_resize([1, 2, 3], 5, 0) |
Result |
[1, 2, 3, 0, 0] |
Alias |
array_resize |
Description |
Sorts the elements of the list in reverse order. See the Sorting Lists section for more details about the NULL sorting order. |
Example |
list_reverse_sort([3, 6, 1, 2]) |
Result |
[6, 3, 2, 1] |
Alias |
array_reverse_sort |
Description |
Reverses the list. |
Example |
list_reverse([3, 6, 1, 2]) |
Result |
[2, 1, 6, 3] |
Alias |
array_reverse |
Description |
Returns a list based on the elements selected by the index_list . |
Example |
list_select([10, 20, 30, 40], [1, 4]) |
Result |
[10, 40] |
Alias |
array_select |
Description |
list_slice with added step feature. |
Example |
list_slice([4, 5, 6], 1, 3, 2) |
Result |
[4, 6] |
Alias |
array_slice |
Description |
Extract a sublist using slice conventions. Negative values are accepted. See slicing. |
Example |
list_slice([4, 5, 6], 2, 3) |
Result |
[5, 6] |
Alias |
array_slice |
Description |
Sorts the elements of the list. See the Sorting Lists section for more details about the sorting order and the NULL sorting order. |
Example |
list_sort([3, 6, 1, 2]) |
Result |
[1, 2, 3, 6] |
Alias |
array_sort |
Description |
Returns a list that is the result of applying the lambda function to each element of the input list. See the Lambda Functions page for more details. |
Example |
list_transform([4, 5, 6], x -> x + 1) |
Result |
[5, 6, 7] |
Aliases |
array_transform , apply , list_apply , array_apply |
Description |
Counts the unique elements of a list. |
Example |
list_unique([1, 1, NULL, -3, 1, 5]) |
Result |
3 |
Alias |
array_unique |
Description |
Create a LIST containing the argument values. |
Example |
list_value(4, 5, 6) |
Result |
[4, 5, 6] |
Alias |
list_pack |
Description |
Returns a list with the BOOLEAN s in mask_list applied as a mask to the value_list . |
Example |
list_where([10, 20, 30, 40], [true, false, false, true]) |
Result |
[10, 40] |
Alias |
array_where |
Description |
Zips k LIST s to a new LIST whose length will be that of the longest list. Its elements are structs of k elements list_1 , …, list_k . Elements missing will be replaced with NULL . |
Example |
list_zip([1, 2], [3, 4], [5, 6]) |
Result |
[{'list_1': 1, 'list_2': 3, 'list_3': 5}, {'list_1': 2, 'list_2': 4, 'list_3': 6}] |
Alias |
array_zip |
Description |
Unnests a list by one level. Note that this is a special function that alters the cardinality of the result. See the unnest page for more details. |
Example |
unnest([1, 2, 3]) |
Result |
1 , 2 , 3 |
The following operators are supported for lists:
Operator |
Description |
Example |
Result |
&& |
Alias for list_intersect . |
[1, 2, 3, 4, 5] && [2, 5, 5, 6] |
[2, 5] |
@> |
Alias for list_has_all , where the list on the right of the operator is the sublist. |
[1, 2, 3, 4] @> [3, 4, 3] |
true |
<@ |
Alias for list_has_all , where the list on the left of the operator is the sublist. |
[1, 4] <@ [1, 2, 3, 4] |
true |
|| |
Alias for list_concat . |
[1, 2, 3] || [4, 5, 6] |
[1, 2, 3, 4, 5, 6] |
<=> |
Alias for list_cosine_similarity . |
[1, 2, 3] <=> [1, 2, 5] |
0.9759000729485332 |
<-> |
Alias for list_distance . |
[1, 2, 3] <-> [1, 2, 5] |
2.0 |
Python-style list comprehension can be used to compute expressions over elements in a list. For example:
SELECT [lower(x) FOR x IN strings]
FROM (VALUES (['Hello', '', 'World'])) t(strings);
SELECT [upper(x) FOR x IN strings IF len(x) > 0]
FROM (VALUES (['Hello', '', 'World'])) t(strings);
Description |
Dot notation that serves as an alias for struct_extract from named STRUCT s. |
Example |
({'i': 3, 's': 'string'}).i |
Result |
3 |
Description |
Bracket notation that serves as an alias for struct_extract from named STRUCT s. |
Example |
({'i': 3, 's': 'string'})['i'] |
Result |
3 |
Description |
Bracket notation that serves as an alias for struct_extract from unnamed STRUCT s (tuples), using an index (1-based). |
Example |
(row(42, 84))[1] |
Result |
42 |
Description |
Create an unnamed STRUCT (tuple) containing the argument values. |
Example |
row(i, i % 4, i / 4) |
Result |
(10, 2, 2.5) |
Description |
Extract the named entry from the STRUCT . |
Example |
struct_extract({'i': 3, 'v2': 3, 'v3': 0}, 'i') |
Result |
3 |
Description |
Extract the entry from an unnamed STRUCT (tuple) using an index (1-based). |
Example |
struct_extract(row(42, 84), 1) |
Result |
42 |
Description |
Add field(s)/value(s) to an existing STRUCT with the argument values. The entry name(s) will be the bound variable name(s). |
Example |
struct_insert({'a': 1}, b := 2) |
Result |
{'a': 1, 'b': 2} |
Description |
Create a STRUCT containing the argument values. The entry name will be the bound variable name. |
Example |
struct_pack(i := 4, s := 'string') |
Result |
{'i': 4, 's': string} |
Name |
Description |
cardinality(map) |
Return the size of the map (or the number of entries in the map). |
element_at(map, key) |
Return a list containing the value for a given key or an empty list if the key is not contained in the map. The type of the key provided in the second parameter must match the type of the map’s keys else an error is returned. |
map_entries(map) |
Return a list of struct(k, v) for each key-value pair in the map. |
map_extract(map, key) |
Alias of element_at . Return a list containing the value for a given key or an empty list if the key is not contained in the map. The type of the key provided in the second parameter must match the type of the map’s keys else an error is returned. |
map_from_entries(STRUCT(k, v)[]) |
Returns a map created from the entries of the array. |
map_keys(map) |
Return a list of all keys in the map. |
map_values(map) |
Return a list of all values in the map. |
map() |
Returns an empty map. |
map[entry] |
Alias for element_at . |
Description |
Return the size of the map (or the number of entries in the map). |
Example |
cardinality(map([4, 2], ['a', 'b'])) |
Result |
2 |
Description |
Return a list containing the value for a given key or an empty list if the key is not contained in the map. The type of the key provided in the second parameter must match the type of the map’s keys else an error is returned. |
Example |
element_at(map([100, 5], [42, 43]), 100) |
Result |
[42] |
Description |
Return a list of struct(k, v) for each key-value pair in the map. |
Example |
map_entries(map([100, 5], [42, 43])) |
Result |
[{'key': 100, 'value': 42}, {'key': 5, 'value': 43}] |
Description |
Alias of element_at . Return a list containing the value for a given key or an empty list if the key is not contained in the map. The type of the key provided in the second parameter must match the type of the map’s keys else an error is returned. |
Example |
map_extract(map([100, 5], [42, 43]), 100) |
Result |
[42] |
Description |
Returns a map created from the entries of the array. |
Example |
map_from_entries([{k: 5, v: 'val1'}, {k: 3, v: 'val2'}]) |
Result |
{5=val1, 3=val2} |
Description |
Return a list of all keys in the map. |
Example |
map_keys(map([100, 5], [42,43])) |
Result |
[100, 5] |
Description |
Return a list of all values in the map. |
Example |
map_values(map([100, 5], [42, 43])) |
Result |
[42, 43] |
Description |
Returns an empty map. |
Example |
map() |
Result |
{} |
Description |
Alias for element_at . |
Example |
map([100, 5], ['a', 'b'])[100] |
Result |
[a] |
Name |
Description |
union.tag |
Dot notation serves as an alias for union_extract . |
union_extract(union, 'tag') |
Extract the value with the named tags from the union. NULL if the tag is not currently selected. |
union_value(tag := any) |
Create a single member UNION containing the argument value. The tag of the value will be the bound variable name. |
union_tag(union) |
Retrieve the currently selected tag of the union as an Enum. |
Description |
Dot notation serves as an alias for union_extract . |
Example |
(union_value(k := 'hello')).k |
Result |
string |
Description |
Extract the value with the named tags from the union. NULL if the tag is not currently selected. |
Example |
union_extract(s, 'k') |
Result |
hello |
Description |
Create a single member UNION containing the argument value. The tag of the value will be the bound variable name. |
Example |
union_value(k := 'hello') |
Result |
'hello'::UNION(k VARCHAR) |
Description |
Retrieve the currently selected tag of the union as an Enum. |
Example |
union_tag(union_value(k := 'foo')) |
Result |
'k' |
DuckDB offers two range functions, range(start, stop, step)
and generate_series(start, stop, step)
, and their variants with default arguments for stop
and step
. The two functions’ behavior differens regarding their stop
argument. This is documented below.
The range
function creates a list of values in the range between start
and stop
.
The start
parameter is inclusive, while the stop
parameter is exclusive.
The default value of start
is 0 and the default value of step
is 1.
Based on the number of arguments, the following variants of range
exist.
The generate_series
function creates a list of values in the range between start
and stop
.
Both the start
and the stop
parameters are inclusive.
The default value of start
is 0 and the default value of step
is 1.
Based on the number of arguments, the following variants of generate_series
exist.
SELECT generate_series(5);
SELECT generate_series(2, 5);
SELECT generate_series(2, 5, 3);
Date ranges are also supported for TIMESTAMP
and TIMESTAMP WITH TIME ZONE
values.
Note that for these types, the stop
and step
arguments have to be specified explicitly (a default value is not provided).
SELECT *
FROM range(DATE '1992-01-01', DATE '1992-03-01', INTERVAL '1' MONTH);
range |
1992-01-01 00:00:00 |
1992-02-01 00:00:00 |
SELECT *
FROM generate_series(DATE '1992-01-01', DATE '1992-03-01', INTERVAL '1' MONTH);
generate_series |
1992-01-01 00:00:00 |
1992-02-01 00:00:00 |
1992-03-01 00:00:00 |
The function list_slice
can be used to extract a sublist from a list. The following variants exist:
list_slice(list, begin, end)
list_slice(list, begin, end, step)
array_slice(list, begin, end)
array_slice(list, begin, end, step)
list[begin:end]
list[begin:end:step]
The arguments are as follows:
list
begin
- Is the index of the first element to be included in the slice
- When
begin < 0
the index is counted from the end of the list
- When
begin < 0
and -begin > length
, begin
is clamped to the beginning of the list
- When
begin > length
, the result is an empty list
- Bracket Notation: When
begin
is omitted, it defaults to the beginning of the list
end
- Is the index of the last element to be included in the slice
- When
end < 0
the index is counted from the end of the list
- When
end > length
, end is clamped to length
- When
end < begin
, the result is an empty list
- Bracket Notation: When
end
is omitted, it defaults to the end of the list. When end
is omitted and a step
is provided, end
must be replaced with a -
step
(optional)
- Is the step size between elements in the slice
- When
step < 0
the slice is reversed, and begin
and end
are swapped
- Must be non-zero
Examples:
SELECT list_slice([1, 2, 3, 4, 5], 2, 4);
SELECT ([1, 2, 3, 4, 5])[2:4:2];
SELECT([1, 2, 3, 4, 5])[4:2:-2];
SELECT ([1, 2, 3, 4, 5])[:];
SELECT ([1, 2, 3, 4, 5])[:-:2];
SELECT ([1, 2, 3, 4, 5])[:-:-2];
The function list_aggregate
allows the execution of arbitrary existing aggregate functions on the elements of a list. Its first argument is the list (column), its second argument is the aggregate function name, e.g., min
, histogram
or sum
.
list_aggregate
accepts additional arguments after the aggregate function name. These extra arguments are passed directly to the aggregate function, which serves as the second argument of list_aggregate
.
SELECT list_aggregate([1, 2, -4, NULL], 'min');
SELECT list_aggregate([2, 4, 8, 42], 'sum');
SELECT list_aggregate([[1, 2], [NULL], [2, 10, 3]], 'last');
SELECT list_aggregate([2, 4, 8, 42], 'string_agg', '|');
The following is a list of existing rewrites. Rewrites simplify the use of the list aggregate function by only taking the list (column) as their argument. list_avg
, list_var_samp
, list_var_pop
, list_stddev_pop
, list_stddev_samp
, list_sem
, list_approx_count_distinct
, list_bit_xor
, list_bit_or
, list_bit_and
, list_bool_and
, list_bool_or
, list_count
, list_entropy
, list_last
, list_first
, list_kurtosis
, list_kurtosis_pop
, list_min
, list_max
, list_product
, list_skewness
, list_sum
, list_string_agg
, list_mode
, list_median
, list_mad
and list_histogram
.
SELECT list_min([1, 2, -4, NULL]);
SELECT list_sum([2, 4, 8, 42]);
SELECT list_last([[1, 2], [NULL], [2, 10, 3]]);
Concatenates list/array elements using an optional delimiter.
SELECT array_to_string([1, 2, 3], '-') AS str;
-- this is equivalent to the following SQL
SELECT list_aggr([1, 2, 3], 'string_agg', '-') AS str;
The function list_sort
sorts the elements of a list either in ascending or descending order. In addition, it allows to provide whether NULL values should be moved to the beginning or to the end of the list.
By default if no modifiers are provided, DuckDB sorts ASC NULLS LAST
, i.e., the values are sorted in ascending order and NULL values are placed first. This is identical to the default sort order of SQLite. The default sort order can be changed using PRAGMA
statements.
list_sort
leaves it open to the user whether they want to use the default sort order or a custom order. list_sort
takes up to two additional optional parameters. The second parameter provides the sort order and can be either ASC
or DESC
. The third parameter provides the NULL
sort order and can be either NULLS FIRST
or NULLS LAST
.
-- default sort order and default NULL sort order
SELECT list_sort([1, 3, NULL, 5, NULL, -5]);
[NULL, NULL, -5, 1, 3, 5]
-- only providing the sort order
SELECT list_sort([1, 3, NULL, 2], 'ASC');
-- providing the sort order and the NULL sort order
SELECT list_sort([1, 3, NULL, 2], 'DESC', 'NULLS FIRST');
list_reverse_sort
has an optional second parameter providing the NULL
sort order. It can be either NULLS FIRST
or NULLS LAST
.
-- default NULL sort order
SELECT list_sort([1, 3, NULL, 5, NULL, -5]);
[NULL, NULL, -5, 1, 3, 5]
-- providing the NULL sort order
SELECT list_reverse_sort([1, 3, NULL, 2], 'NULLS LAST');
DuckDB supports lambda functions in the form (parameter1, parameter2, ...) -> expression
.
For details, see the lambda functions page.
The flatten function is a scalar function that converts a list of lists into a single list by concatenating each sub-list together.
Note that this only flattens one level at a time, not all levels of sub-lists.
-- Convert a list of lists into a single list
SELECT
flatten([
[1, 2],
[3, 4]
]);
-- If the list has multiple levels of lists,
-- only the first level of sub-lists is concatenated into a single list
SELECT
flatten([
[
[1, 2],
[3, 4],
],
[
[5, 6],
[7, 8],
]
]);
[[1, 2], [3, 4], [5, 6], [7, 8]]
In general, the input to the flatten function should be a list of lists (not a single level list).
However, the behavior of the flatten function has specific behavior when handling empty lists and NULL
values.
-- If the input list is empty, return an empty list
SELECT flatten([]);
-- If the entire input to flatten is NULL, return NULL
SELECT flatten(NULL);
-- If a list whose only entry is NULL is flattened, return an empty list
SELECT flatten([NULL]);
-- If the sub-list in a list of lists only contains NULL,
-- do not modify the sub-list
-- (Note the extra set of parentheses vs. the prior example)
SELECT flatten([[NULL]]);
-- Even if the only contents of each sub-list is NULL,
-- still concatenate them together
-- Note that no de-duplication occurs when flattening.
-- See list_distinct function for de-duplication.
SELECT flatten([[NULL],[NULL]]);
The generate_subscripts(arr, dim)
function generates indexes along the dim
th dimension of array arr
.
SELECT generate_subscripts([4, 5, 6], 1) AS i;
There are also aggregate functions list
and histogram
that produces lists and lists of structs.
The unnest
function is used to unnest a list by one level.
Last modified: 2024-04-18