Search Shortcut cmd + k | ctrl + k
pbix

Extension that allows parsing the data model embedded in PowerBI (pbix) files

Maintainer(s): Hugoberry

Installing and Loading

INSTALL pbix FROM community;
LOAD pbix;

Example

-- Get metadata tables from a PowerBI file
SELECT Name FROM pbix_meta('Adventure Works DW 2020.pbix','table') WHERE isHidden=0;

-- Read data from a specific table in the PowerBI file
SELECT 
  ResellerKey, 
  "Business Type", 
  Reseller, 
  "Reseller ID" 
FROM pbix_read('Adventure Works DW 2020.pbix','Reseller') 
LIMIT 10;

-- Read metadata about models in a folder of pbix files
SELECT
  file,
  list_transform(pbix2vpax(file).Tables, t->t.TableName) as tab
FROM glob('data/**/*.pbix');

About pbix

The PBIX extension allows you to parse the data model embedded in PowerBI (pbix) files directly in DuckDB.

It provides three functions: - pbix_meta(): Returns metadata tables for a data model (consult MS-SSAS-T for metadata structures) - pbix_read(): Returns the contents of a specific table from a pbix file - pbix2vpax(): Generate comprehensive VPAX serialisation of the entire data model (scalar function)

For a pure Python implementation of the pbix parser, check out the PBIXray library.

Note: Current limitations include the inability of the WASM version to parse https hosted files, and that pbix_read() will decompress the entire model in memory.

Added Functions

function_name function_type description comment examples
pbix2vpax scalar NULL NULL  
pbix_meta table NULL NULL  
pbix_read table NULL NULL  

Added Settings

name description input_type scope aliases
pbix_ignore_errors Return empty VPAX structure instead of throwing exceptions on parse errors BOOLEAN GLOBAL []
pbix_trailing_chunks_optimization Number of trailing chunks to read for optimized file parsing INTEGER GLOBAL []