Extension that allows parsing the data model embedded in PowerBI (pbix) files
Maintainer(s):
Hugoberry
Installing and Loading
INSTALL pbix FROM community;
LOAD pbix;
Example
-- Get metadata tables from a PowerBI file
SELECT Name FROM pbix_meta('Adventure Works DW 2020.pbix','table') WHERE isHidden=0;
-- Read data from a specific table in the PowerBI file
SELECT
ResellerKey,
"Business Type",
Reseller,
"Reseller ID"
FROM pbix_read('Adventure Works DW 2020.pbix','Reseller')
LIMIT 10;
-- Read metadata about models in a folder of pbix files
SELECT
file,
list_transform(pbix2vpax(file).Tables, t->t.TableName) as tab
FROM glob('data/**/*.pbix');
About pbix
The PBIX extension allows you to parse the data model embedded in PowerBI (pbix) files directly in DuckDB.
It provides three functions: - pbix_meta()
: Returns metadata tables for a data model (consult MS-SSAS-T for metadata structures) - pbix_read()
: Returns the contents of a specific table from a pbix file - pbix2vpax()
: Generate comprehensive VPAX serialisation of the entire data model (scalar function)
For a pure Python implementation of the pbix parser, check out the PBIXray library.
Note: Current limitations include the inability of the WASM version to parse https
hosted files, and that pbix_read() will decompress the entire model in memory.
Added Functions
function_name | function_type | description | comment | examples |
---|---|---|---|---|
pbix2vpax | scalar | NULL | NULL | |
pbix_meta | table | NULL | NULL | |
pbix_read | table | NULL | NULL |
Added Settings
name | description | input_type | scope | aliases |
---|---|---|---|---|
pbix_ignore_errors | Return empty VPAX structure instead of throwing exceptions on parse errors | BOOLEAN | GLOBAL | [] |
pbix_trailing_chunks_optimization | Number of trailing chunks to read for optimized file parsing | INTEGER | GLOBAL | [] |