libmagic/file utilities ported to DuckDB
Installing and Loading
INSTALL magic FROM community;
LOAD magic;
Example
--- Discover autodetected types for files in a local folder
SELECT magic_mime(file), magic_type(file), file
FROM glob('path/to/folder/**');
--- Discover autodetected types for a remote file
LOAD httpfs; --- currently this must be loaded explicitly once per session
SELECT magic_mime(file), magic_type(file), file
FROM glob('https://raw.githubusercontent.com/duckdb/duckdb/main/data/parquet-testing/adam_genotypes.parquet');
--- Read a file without specifying its type
FROM read_any('https://raw.githubusercontent.com/duckdb/duckdb/main/data/parquet-testing/adam_genotypes.parquet');
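Because `magic_mime` is a scalar function, it can also be combined with ordinary SQL aggregation. The query below is a minimal sketch that counts files per detected MIME type; it uses only the functions shown above, the folder path is a placeholder, and the column aliases `mime` and `n` are chosen here for illustration.

```sql
--- Summarize a local folder by autodetected MIME type
--- ('path/to/folder/**' is a placeholder path; mime and n are illustrative aliases)
SELECT magic_mime(file) AS mime, count(*) AS n
FROM glob('path/to/folder/**')
GROUP BY mime
ORDER BY n DESC;
```

The MIME strings returned depend on the bundled magic database, so inspect the output before filtering on specific values.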
About magic
A very experimental port of libmagic (the library that powers the UNIX file utility). It classifies files based on the content of their header, according to the libmagic library. It is packaged with version 5.45 of the magic library. The magic.mgc database is currently statically compiled into the library, so it is the same across platforms but immutable. The extension is currently not available on Windows and Wasm, due to different but likely solvable vc-packaging issues, to be sorted out independently.
Added Functions
| function_name | function_type | description | comment | example |
|---|---|---|---|---|
| magic_mime | scalar | | | |
| magic_type | scalar | | | |
| read_any | table_macro | | | |