Search Shortcut cmd + k | ctrl + k
magic

libmagic/file utilities ported to DuckDB

Maintainer(s): carlopi

Installing and Loading

INSTALL magic FROM community;
LOAD magic;

Example

--- Discover autodetected types for files in a folder
SELECT magic_mime(file), magic_type(file), file
    FROM glob('path/to/folder/**');

--- Inspect auto-detection capabilities of the `magic` extension
FROM magic_capabilities();

--- A remote GeoPackage — spatial, sniffed by content (runnable as-is):
FROM read_any('https://www.geopackage.org/data/gdal_sample_v1.2_no_extensions.gpkg');

--- A table inside a Postgres database (bring your own connection string):
FROM read_any('postgres://user:pass@host:5432/[email protected]');

--- An Iceberg table on AWS S3 Tables (bring your own ARN + CREATE SECRET):
FROM read_any('arn:aws:s3tables:us-east-1:123456789012:bucket/[email protected]');

About magic

Port of libmagic (that powers file UNIX utility), allow to classify files based on the content of the header, accoring to the libmagic library. Packaged with version 5.45 of the magic library. The magic.mgc database is at the moment statically compiled in the library, so it's the same across platforms but immutable. Implements read-from-any file capabilties, via magic auto-detection or name pattern. Implements also sub-resource dispatcher via @schema.tbl, allow also catalogs to be queried as a table.

Added Functions

function_name function_type description comment examples
magic_archive_members scalar NULL NULL  
magic_capabilities table_macro NULL NULL  
magic_mime scalar Returns the MIME type for the given file path using the libmagic database (e.g. 'application/json', 'text/plain'). NULL [SELECT magic_mime('myfile.json');, SELECT file, magic_mime(file) AS mime FROM glob('data/*/');]
magic_required_extensions scalar Returns the list of DuckDB extensions that must be loaded before reading the given file with read_any(). Returns an empty list for built-in formats (CSV, blob). NULL [SELECT magic_required_extensions('myfile.json');, SELECT file, magic_required_extensions(file) AS exts FROM glob('data/*/');]
magic_type scalar Returns the file type description for the given file path using the libmagic database (e.g. 'Apache Parquet', 'JSON data'). NULL [SELECT magic_type('myfile.parquet');, SELECT file, magic_type(file) AS type FROM glob('data/*/');]
read_any table_macro NULL NULL  
read_any_impl table_macro NULL NULL  
read_attacheable_database table NULL NULL  
read_har table_macro NULL NULL  
split_into_components scalar NULL NULL  

Overloaded Functions

This extension does not add any function overloads.

Added Types

This extension does not add any types.

Added Settings

This extension does not add any settings.