Read PFC-JSONL compressed log files directly in DuckDB with block-level timestamp filtering
Installing and Loading
INSTALL pfc FROM community;
LOAD pfc;
Example
INSTALL pfc FROM community;
LOAD pfc;
LOAD json;
-- Read a compressed PFC-JSONL log file
SELECT
line->>'$.level' AS level,
line->>'$.message' AS message
FROM read_pfc_jsonl('/path/to/events.pfc')
LIMIT 10;
-- Block-level timestamp filter: only decompress relevant blocks
SELECT count(*) FROM read_pfc_jsonl(
'/path/to/events.pfc',
ts_from = epoch(TIMESTAMPTZ '2026-01-01 00:00:00+00'),
ts_to = epoch(TIMESTAMPTZ '2026-01-02 00:00:00+00')
);
About pfc
The pfc extension lets you query PFC-JSONL compressed log files directly in DuckDB — no decompression step, no intermediate files.
Why PFC-JSONL?
PFC-JSONL achieves better compression than gzip and zstd on real structured
log data by combining BWT + rANS entropy coding tuned for repetitive JSONL patterns.
On top of superior compression, every .pfc file supports random block access —
only the blocks you need are ever decompressed.
Ratios measured on realistic JSONL log data (8 services, mixed log levels):
| Compressor | Compression ratio | vs PFC-JSONL |
|---|---|---|
| gzip | ~12 % | 25 % larger |
| zstd | ~14 % | 37 % larger |
| PFC-JSONL | ~9 % | — |
Key Features
-
Block-level timestamp filtering via
ts_from/ts_toparameters. Entire blocks outside the requested time window are skipped before decompression. A 1-hour query on a 30-day file with hourly blocks reads ~1/720 of the data (~720× speedup). -
No account, no signup, no limits. Free for personal and open-source use. Commercial use requires a license — see ImpossibleForge/pfc-jsonl.
-
Simple API: one column
line VARCHARper row (raw JSON string). Combine with DuckDB's built-injsonextension for field extraction and analytics. -
Linux and macOS only. Windows users: use WSL2.
Requirements
The extension calls the PFC-JSONL binary (v3.4+) for block decompression. Install once:
curl -L https://github.com/ImpossibleForge/pfc-jsonl/releases/latest/download/pfc_jsonl-linux-x64 -o /usr/local/bin/pfc_jsonl && chmod +x /usr/local/bin/pfc_jsonl
The binary is expected at /usr/local/bin/pfc_jsonl.
Override with the PFC_JSONL_BINARY environment variable.
File Format
Each .pfc file must have a companion .pfc.bidx binary block index,
written automatically by PFC-JSONL v3.4+:
pfc_jsonl compress input.jsonl output.pfc
# Produces: output.pfc + output.pfc.bidx
Analytics Example
LOAD pfc;
LOAD json;
-- Error rate per hour for a specific day
SELECT
strftime(to_timestamp((line->>'$.ts')::BIGINT), '%H:00') AS hour,
count(*) FILTER (WHERE line->>'$.level' = 'ERROR') AS errors,
count(*) AS total
FROM read_pfc_jsonl(
'/var/log/api.pfc',
ts_from = epoch(TIMESTAMPTZ '2026-03-15 00:00:00+00'),
ts_to = epoch(TIMESTAMPTZ '2026-03-16 00:00:00+00')
)
GROUP BY hour
ORDER BY hour;
Added Functions
| function_name | function_type | description | comment | examples |
|---|---|---|---|---|
| read_pfc_jsonl | table | NULL | NULL |
Overloaded Functions
This extension does not add any function overloads.
Added Types
This extension does not add any types.
Added Settings
This extension does not add any settings.