hive_metastore
Connect to your Hive Metastore, attach it as a native DuckDB catalog and query the data inside with ease!
Maintainer(s):
thijs-s
Installing and Loading
INSTALL hive_metastore FROM community;
LOAD hive_metastore;
Example
-- Attach the Hive Metastore as a catalog in DuckDB
ATTACH 'thrift://<host>:<port>' AS <catalog_name> (TYPE hive_metastore);
-- You are ready to rock!
SELECT * FROM <catalog_name>.<schema_name>.<table_name>;
-- To query tables that live in object storage, you still need to configure the matching storage extension (e.g. S3)
CREATE SECRET s3 (TYPE S3, KEY_ID 'access-key', SECRET 'secret-key', ENDPOINT 'localhost:9000');
About hive_metastore
The DuckDB Hive Metastore extension enables DuckDB to connect to an Apache Hive Metastore via the Thrift protocol and query tables stored in DuckDB-supported formats. It integrates with the Hive ecosystem while using DuckDB as the analytical query engine.
Key Features
- Implementation of Hive catalog as a native DuckDB catalog
- Automatic schema discovery, including support for complex data types such as arrays and maps
- Support for Parquet, CSV, Iceberg, Delta, ORC, and Avro
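To see what schema discovery produces, one option is to inspect a table with DuckDB's DESCRIBE once the catalog is attached. This is only a sketch: the catalog name hms, the table sales.orders, and its columns tags (an array) and attributes (a map) are hypothetical and stand in for whatever your metastore contains.

```sql
-- Hypothetical catalog (hms), schema (sales), and table (orders).
-- Show the column names and DuckDB types derived from the metastore definition.
DESCRIBE hms.sales.orders;

-- Hypothetical complex columns: tags is an array, attributes is a map.
-- DuckDB list indexing is 1-based; map entries are looked up by key.
SELECT tags[1] AS first_tag,
       attributes['country'] AS country
FROM hms.sales.orders
LIMIT 10;
```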
Usage
Using the extension comes down to attaching the Hive Metastore as a catalog in DuckDB and then querying its tables as if they were native DuckDB tables. The attach command looks like any other DuckDB ATTACH command:
ATTACH 'thrift://<host>:<port>' AS <catalog_name> (<args>);
Supported arguments include:
- TYPE (required): Must be set to hive_metastore to indicate that we want to use the Hive Metastore extension.
- WAREHOUSE_LOCATION: The warehouse location path. Used for table storage location resolution (mostly not required, but can be useful in some cases).
- DEFAULT_SCHEMA: The database/schema name to use when queries don't specify one. Defaults to "default" if not provided.
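As a minimal sketch of how these arguments combine, the statement below attaches a metastore with all three options set. The host name, the catalog name hms, the bucket path, and the schema name analytics are hypothetical placeholders; 9083 is the port a Hive Metastore conventionally listens on. The option syntax is assumed to follow DuckDB's usual comma-separated ATTACH option list.

```sql
-- Hypothetical host, catalog name, warehouse path, and schema; adjust to your setup.
ATTACH 'thrift://metastore.example.com:9083' AS hms (
    TYPE hive_metastore,
    WAREHOUSE_LOCATION 's3://example-bucket/warehouse',
    DEFAULT_SCHEMA 'analytics'
);

-- Fully qualified reference: catalog.schema.table
SELECT count(*) FROM hms.analytics.events;

-- Assumption: with DEFAULT_SCHEMA set, the schema part can be left out
SELECT count(*) FROM hms.events;
```

If the tables live in object storage, the corresponding secret (as in the S3 example earlier) still has to exist before the SELECT statements run.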