- Installation
- Documentation
- Getting Started
- Connect
- Data Import
- Overview
- Data Sources
- CSV Files
- JSON Files
- Overview
- Creating JSON
- Loading JSON
- Writing JSON
- JSON Type
- JSON Functions
- Format Settings
- Installing and Loading
- SQL to / from JSON
- Caveats
- Multiple Files
- Parquet Files
- Partitioning
- Appender
- INSERT Statements
- Client APIs
- Overview
- C
- Overview
- Startup
- Configuration
- Query
- Data Chunks
- Vectors
- Values
- Types
- Prepared Statements
- Appender
- Table Functions
- Replacement Scans
- API Reference
- C++
- CLI
- Dart
- Go
- Java
- Julia
- Node.js (Neo)
- Node.js
- Python
- Overview
- Data Ingestion
- Conversion between DuckDB and Python
- DB API
- Relational API
- Function API
- Types API
- Expression API
- Spark API
- API Reference
- Known Python Issues
- R
- Rust
- Swift
- Wasm
- ADBC
- ODBC
- SQL
- Introduction
- Statements
- Overview
- ANALYZE
- ALTER TABLE
- ALTER VIEW
- ATTACH and DETACH
- CALL
- CHECKPOINT
- COMMENT ON
- COPY
- CREATE INDEX
- CREATE MACRO
- CREATE SCHEMA
- CREATE SECRET
- CREATE SEQUENCE
- CREATE TABLE
- CREATE VIEW
- CREATE TYPE
- DELETE
- DESCRIBE
- DROP
- EXPORT and IMPORT DATABASE
- INSERT
- LOAD / INSTALL
- PIVOT
- Profiling
- SELECT
- SET / RESET
- SET VARIABLE
- SUMMARIZE
- Transaction Management
- UNPIVOT
- UPDATE
- USE
- VACUUM
- Query Syntax
- SELECT
- FROM and JOIN
- WHERE
- GROUP BY
- GROUPING SETS
- HAVING
- ORDER BY
- LIMIT and OFFSET
- SAMPLE
- Unnesting
- WITH
- WINDOW
- QUALIFY
- VALUES
- FILTER
- Set Operations
- Prepared Statements
- Data Types
- Overview
- Array
- Bitstring
- Blob
- Boolean
- Date
- Enum
- Interval
- List
- Literal Types
- Map
- NULL Values
- Numeric
- Struct
- Text
- Time
- Timestamp
- Time Zones
- Union
- Typecasting
- Expressions
- Overview
- CASE Statement
- Casting
- Collations
- Comparisons
- IN Operator
- Logical Operators
- Star Expression
- Subqueries
- Functions
- Overview
- Aggregate Functions
- Array Functions
- Bitstring Functions
- Blob Functions
- Date Format Functions
- Date Functions
- Date Part Functions
- Enum Functions
- Interval Functions
- Lambda Functions
- List Functions
- Map Functions
- Nested Functions
- Numeric Functions
- Pattern Matching
- Regular Expressions
- Struct Functions
- Text Functions
- Time Functions
- Timestamp Functions
- Timestamp with Time Zone Functions
- Union Functions
- Utility Functions
- Window Functions
- Constraints
- Indexes
- Meta Queries
- DuckDB's SQL Dialect
- Overview
- Indexing
- Friendly SQL
- Keywords and Identifiers
- Order Preservation
- PostgreSQL Compatibility
- SQL Quirks
- Samples
- Configuration
- Extensions
- Overview
- Core Extensions
- Community Extensions
- Working with Extensions
- Versioning of Extensions
- Arrow
- AutoComplete
- AWS
- Azure
- Delta
- Excel
- Full Text Search
- httpfs (HTTP and S3)
- Iceberg
- ICU
- inet
- jemalloc
- MySQL
- PostgreSQL
- Spatial
- SQLite
- Substrait
- TPC-DS
- TPC-H
- VSS
- Guides
- Overview
- Data Viewers
- Database Integration
- File Formats
- Overview
- CSV Import
- CSV Export
- Directly Reading Files
- Excel Import
- Excel Export
- JSON Import
- JSON Export
- Parquet Import
- Parquet Export
- Querying Parquet Files
- Network and Cloud Storage
- Overview
- HTTP Parquet Import
- S3 Parquet Import
- S3 Parquet Export
- S3 Iceberg Import
- S3 Express One
- GCS Import
- Cloudflare R2 Import
- DuckDB over HTTPS / S3
- Meta Queries
- Describe Table
- EXPLAIN: Inspect Query Plans
- EXPLAIN ANALYZE: Profile Queries
- List Tables
- Summarize
- DuckDB Environment
- ODBC
- Performance
- Overview
- Environment
- Import
- Schema
- Indexing
- Join Operations
- File Formats
- How to Tune Workloads
- My Workload Is Slow
- Benchmarks
- Python
- Installation
- Executing SQL
- Jupyter Notebooks
- SQL on Pandas
- Import from Pandas
- Export to Pandas
- Import from Numpy
- Export to Numpy
- SQL on Arrow
- Import from Arrow
- Export to Arrow
- Relational API on Pandas
- Multiple Python Threads
- Integration with Ibis
- Integration with Polars
- Using fsspec Filesystems
- SQL Editors
- SQL Features
- Snippets
- Glossary of Terms
- Browsing Offline
- Operations Manual
- Overview
- Limits
- Non-Deterministic Behavior
- Embedding DuckDB
- DuckDB's Footprint
- Securing DuckDB
- Development
- DuckDB Repositories
- Testing
- Overview
- sqllogictest Introduction
- Writing Tests
- Debugging
- Result Verification
- Persistent Testing
- Loops
- Multiple Connections
- Catch
- Profiling
- Release Calendar
- Building DuckDB
- Overview
- Build Configuration
- Building Extensions
- Android
- Linux
- macOS
- Raspberry Pi
- Windows
- Troubleshooting
- Benchmark Suite
- Internals
- Sitemap
- Why DuckDB
- Media
- FAQ
- Code of Conduct
- Live Demo
An ARRAY
column stores fixed-sized arrays. All fields in the column must have the same length and the same underlying type. Arrays are typically used to store arrays of numbers, but can contain any uniform data type, including ARRAY
, LIST
and STRUCT
types.
Arrays can be used to store vectors such as word embeddings or image embeddings.
To store variable-length lists, use the LIST
type. See the data types overview for a comparison between nested data types.
The
ARRAY
type in PostgreSQL allows variable-length fields. DuckDB'sARRAY
type is fixed-length.
Creating Arrays
Arrays can be created using the array_value(expr, ...)
function.
Construct with the array_value
function:
SELECT array_value(1, 2, 3);
You can always implicitly cast an array to a list (and use list functions, like list_extract
, [i]
):
SELECT array_value(1, 2, 3)[2];
You can cast from a list to an array (the dimensions have to match):
SELECT [3, 2, 1]::INTEGER[3];
Arrays can be nested:
SELECT array_value(array_value(1, 2), array_value(3, 4), array_value(5, 6));
Arrays can store structs:
SELECT array_value({'a': 1, 'b': 2}, {'a': 3, 'b': 4});
Defining an Array Field
Arrays can be created using the ⟨TYPE_NAME⟩[⟨LENGTH⟩]
syntax. For example, to create an array field for 3 integers, run:
CREATE TABLE array_table (id INTEGER, arr INTEGER[3]);
INSERT INTO array_table VALUES (10, [1, 2, 3]), (20, [4, 5, 6]);
Retrieving Values from Arrays
Retrieving one or more values from an array can be accomplished using brackets and slicing notation, or through list functions like list_extract
and array_extract
. Using the example in Defining an Array Field.
The following queries for extracting the second element of an array are equivalent:
SELECT id, arr[1] AS element FROM array_table;
SELECT id, list_extract(arr, 1) AS element FROM array_table;
SELECT id, array_extract(arr, 1) AS element FROM array_table;
id | element |
---|---|
10 | 1 |
20 | 4 |
Using the slicing notation returns a LIST
:
SELECT id, arr[1:2] AS elements FROM array_table;
id | elements |
---|---|
10 | [1, 2] |
20 | [4, 5] |
Functions
All LIST
functions work with the ARRAY
type. Additionally, several ARRAY
-native functions are also supported.
See the ARRAY
functions.
Examples
Create sample data:
CREATE TABLE x (i INTEGER, v FLOAT[3]);
CREATE TABLE y (i INTEGER, v FLOAT[3]);
INSERT INTO x VALUES (1, array_value(1.0::FLOAT, 2.0::FLOAT, 3.0::FLOAT));
INSERT INTO y VALUES (1, array_value(2.0::FLOAT, 3.0::FLOAT, 4.0::FLOAT));
Compute cross product:
SELECT array_cross_product(x.v, y.v)
FROM x, y
WHERE x.i = y.i;
Compute cosine similarity:
SELECT array_cosine_similarity(x.v, y.v)
FROM x, y
WHERE x.i = y.i;
Ordering
The ordering of ARRAY
instances is defined using a lexicographical order. NULL
values compare greater than all other values and are considered equal to each other.
See Also
For more functions, see List Functions.