Deserialize Protocol Buffer messages stored in DuckDB columns
Installing and Loading
INSTALL protoduck FROM community;
LOAD protoduck;
Example
INSTALL protoduck FROM community;
LOAD protoduck;
-- Register a protobuf schema at runtime
SELECT proto_schema_add('
syntax = "proto3";
package demo;
message Person {
  string name = 1;
  int32 age = 2;
}
');
-- Decode a protobuf BLOB to JSON
SELECT proto_to_json(data, 'demo.Person') FROM people;
-- Extract a single field via dot-notation path
SELECT proto_get(data, 'demo.Person', 'name') FROM people;
About protoduck
ProtoDuck is a DuckDB extension that deserializes Protocol Buffer messages stored directly in table columns as BLOBs, rather than messages read from files. It is aimed at querying protobuf-encoded payloads in data lakes and analytics pipelines.
Schemas are registered at runtime via proto_schema_add (from .proto
source) or proto_schema_add_binary (from a compiled FileDescriptorSet).
Once registered, messages can be decoded to JSON with proto_to_json /
proto_decode, or individual fields can be pulled out with proto_get
using a path syntax that supports nested messages, repeated fields
(items[0]), and maps (metadata["key"]).
All protobuf scalar types, enums, oneofs, maps, and nested messages are supported.
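The three path forms mentioned above (nested fields, repeated-field indexes, and map keys) can be sketched as follows. This is an illustration only: the demo.Order and demo.Customer messages, the orders table, and its data column are hypothetical names chosen for the example, and the calls reuse only the proto_schema_add and three-argument proto_get forms shown in the Example section.

```sql
-- Hypothetical schema for illustration; it is not shipped with the extension.
SELECT proto_schema_add('
syntax = "proto3";
package demo;
message Customer {
  string name = 1;
}
message Order {
  Customer customer = 1;
  repeated string items = 2;
  map<string, string> metadata = 3;
}
');

-- Assumes a table orders(data BLOB) holding serialized demo.Order messages.
SELECT
    proto_get(data, 'demo.Order', 'customer.name')      AS customer_name,  -- nested message field
    proto_get(data, 'demo.Order', 'items[0]')           AS first_item,     -- first element of a repeated field
    proto_get(data, 'demo.Order', 'metadata["region"]') AS region          -- map lookup by key
FROM orders;
```

The path is an ordinary single-quoted SQL string, so the double quotes around a map key need no escaping.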
Added Functions
| function_name | function_type | description | comment | examples |
|---|---|---|---|---|
| proto_decode | scalar | NULL | NULL | |
| proto_describe | scalar | NULL | NULL | |
| proto_get | scalar | NULL | NULL | |
| proto_schema_add | scalar | NULL | NULL | |
| proto_schema_add_binary | scalar | NULL | NULL | |
| proto_to_json | scalar | NULL | NULL | |
Overloaded Functions
This extension does not add any function overloads.
Added Types
This extension does not add any types.
Added Settings
This extension does not add any settings.