protoduck

Deserialize Protocol Buffer messages stored in DuckDB columns

Maintainer(s): fcsnk

Installing and Loading

INSTALL protoduck FROM community;
LOAD protoduck;

Example

INSTALL protoduck FROM community;
LOAD protoduck;

-- Register a protobuf schema at runtime
SELECT proto_schema_add('
    syntax = "proto3";
    package demo;
    message Person {
        string name = 1;
        int32 age = 2;
    }
');

-- Decode a protobuf BLOB to JSON
SELECT proto_to_json(data, 'demo.Person') FROM people;

-- Extract a single field via dot-notation path
SELECT proto_get(data, 'demo.Person', 'name') FROM people;

About protoduck

ProtoDuck is a DuckDB extension for deserializing Protocol Buffer messages stored directly in table columns (BLOBs), rather than from files. It is aimed at querying protobuf-encoded payloads in data lakes and analytics pipelines.

Schemas are registered at runtime via proto_schema_add (from .proto source) or proto_schema_add_binary (from a compiled FileDescriptorSet). Once registered, messages can be decoded to JSON with proto_to_json / proto_decode, or individual fields can be pulled out with proto_get using a path syntax that supports nested messages, repeated fields (items[0]), and maps (metadata["key"]).
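
As a rough sketch of the path syntax described above, the query below pulls a nested field, a repeated-field element, and a map entry out of one message. The demo.Order and demo.Customer messages, the orders table, its payload column, and the "region" key are all hypothetical names made up for illustration. The path strings follow the forms named above (dot notation, items[0], metadata["key"]). What proto_get returns for each path, and how it handles a missing element or key, is not specified here, so check the results against your own data.

```sql
-- Hypothetical schema, for illustration only.
SELECT proto_schema_add('
    syntax = "proto3";
    package demo;
    message Customer {
        string name = 1;
    }
    message Order {
        Customer customer = 1;
        repeated string items = 2;
        map<string, string> metadata = 3;
    }
');

-- Assumes a table orders(payload BLOB) holding serialized demo.Order messages.
SELECT
    proto_get(payload, 'demo.Order', 'customer.name')      AS customer_name, -- nested message field
    proto_get(payload, 'demo.Order', 'items[0]')           AS first_item,    -- first element of a repeated field
    proto_get(payload, 'demo.Order', 'metadata["region"]') AS region         -- map lookup by key
FROM orders;
```

Extracting single fields this way avoids decoding the whole message to JSON when only a few values are needed.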

All protobuf scalar types, enums, oneofs, maps, and nested messages are supported.

Added Functions

function_name            function_type  description  comment  examples
proto_decode             scalar         NULL         NULL
proto_describe           scalar         NULL         NULL
proto_get                scalar         NULL         NULL
proto_schema_add         scalar         NULL         NULL
proto_schema_add_binary  scalar         NULL         NULL
proto_to_json            scalar         NULL         NULL

Overloaded Functions

This extension does not add any function overloads.

Added Types

This extension does not add any types.

Added Settings

This extension does not add any settings.