A DuckDB extension to read and write RDF
Installing and Loading
INSTALL rdf FROM community;
LOAD rdf;
Example
-- 0. Assuming the extension is already installed and loaded
-- 1. Count the triples across all NTriples files in a directory
SELECT COUNT(*) FROM read_rdf('data/shards/*.nt');
-- 2. Get subjects and predicates of a turtle file
SELECT subject, predicate FROM read_rdf('test/rdf/tests.ttl');
-- 3. Write a query result to RDF (NTriples by default) using an R2RML mapping
COPY (SELECT empno, ename, deptno FROM emp)
TO 'output.nt'
(FORMAT r2rml, mapping 'mapping.ttl');
-- 4. Execute a full R2RML mapping (with embedded queries) to write RDF
COPY (SELECT 1) TO 'output.nt' (FORMAT r2rml, mapping 'mapping.ttl');
-- 5. Check if an R2RML mapping is valid
SELECT is_valid_r2rml('mapping.ttl');
About rdf
The duck_rdf extension enables DuckDB to read and write RDF (Resource Description Framework)
data directly, using the SERD library for
parsing and serialization.
Supported Formats
Read: Turtle (.ttl), NTriples (.nt), NQuads (.nq), TriG (.trig), and
RDF/XML (.rdf/.xml, experimental — read-only).
Write: NTriples, Turtle, NQuads (via R2RML mapping).
Reading RDF
read_rdf() returns six columns: subject, predicate, object (always populated),
and graph, language_tag, datatype (nullable). It accepts a file path or glob pattern;
multiple matched files are scanned in parallel.
SELECT subject, predicate FROM read_rdf('data.ttl');
SELECT COUNT(*) FROM read_rdf('data/shards/*.nt');
SELECT * FROM read_rdf('data/*.dat', file_type = 'ttl', strict_parsing = false);
Optional parameters:
| Parameter | Default | Description |
|---|---|---|
| strict_parsing | true | Set to false to allow malformed URIs |
| prefix_expansion | false | Expand CURIE-form URIs to full URIs (Turtle/TriG only) |
| file_type | auto-detected | Override format: ttl, nt, nq, trig, rdf/xml |
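As a sketch of how the six columns can be combined, the query below counts triples per predicate and reports how many objects carry a language tag or a datatype. It uses only the column names listed above and the data.ttl path from the earlier examples. That language_tag and datatype are NULL whenever an object has no tag or datatype is an assumption about how the extension fills these nullable columns.

```sql
-- Assumes the rdf extension is already installed and loaded.
-- Assumption: language_tag / datatype are NULL for objects without a tag or datatype.
SELECT
    predicate,
    COUNT(*)            AS triples,            -- all triples for this predicate
    COUNT(language_tag) AS with_language_tag,  -- COUNT(col) skips NULLs
    COUNT(datatype)     AS with_datatype
FROM read_rdf('data.ttl')
GROUP BY predicate
ORDER BY triples DESC;
```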
The experimental read_sparql(endpoint, query) sends a SPARQL SELECT query to a remote endpoint and returns the result set as a DuckDB table. Column names are derived from the SPARQL variable names; all columns are VARCHAR. Unbound variables are returned as empty strings.
-- Count the number of humans in Wikidata
SELECT * FROM read_sparql(
'https://query.wikidata.org/sparql',
'SELECT (COUNT(*) AS ?count) WHERE { ?item wdt:P31 wd:Q5 .}'
);
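Because every column comes back as VARCHAR, numeric results usually need a cast before they can be compared or summed. The sketch below reuses the Wikidata query; it assumes the SPARQL variable ?count is exposed as a column named count (without the leading question mark) and that the value arrives as a plain numeric string, neither of which is spelled out above.

```sql
-- Assumption: the ?count variable becomes a column called "count".
-- NULLIF turns the empty string used for unbound variables into NULL before casting.
SELECT CAST(NULLIF("count", '') AS BIGINT) AS human_count
FROM read_sparql(
    'https://query.wikidata.org/sparql',
    'SELECT (COUNT(*) AS ?count) WHERE { ?item wdt:P31 wd:Q5 . }'
);
```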
Writing RDF (R2RML)
Write RDF using R2RML mapping files with DuckDB's COPY TO
syntax. Two modes are supported:
Inside-out mode — DuckDB drives the query; the mapping has no rr:logicalTable:
COPY (SELECT empno, ename, deptno FROM emp)
TO 'output.nt' (FORMAT r2rml, mapping 'mapping.ttl');
Full R2RML mode — the mapping defines its own queries:
COPY (SELECT 1) TO 'output.nt' (FORMAT r2rml, mapping 'mapping.ttl');
Write options:
| Option | Required | Default | Description |
|---|---|---|---|
| mapping | Yes | none | Path to R2RML mapping file (.ttl) |
| rdf_format | No | ntriples | Output format: ntriples, turtle, or nquads |
| ignore_non_fatal_errors | No | true | Raise an exception on the first parse error when false |
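Write options are passed alongside mapping in the COPY statement. The following sketch writes Turtle instead of the default NTriples and then reads the file back with read_rdf() as a quick sanity check. It assumes rdf_format takes a quoted string in the same way mapping does, and that the emp table and mapping.ttl from the earlier examples exist; output.ttl is a hypothetical file name.

```sql
-- Assumption: rdf_format is given as a quoted string, like mapping.
COPY (SELECT empno, ename, deptno FROM emp)
TO 'output.ttl'
(FORMAT r2rml, mapping 'mapping.ttl', rdf_format 'turtle');

-- Read the result back to count the triples that were written.
SELECT COUNT(*) AS triples_written FROM read_rdf('output.ttl');
```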
Validation Helpers
SELECT is_valid_r2rml('mapping.ttl'); -- validate an R2RML mapping file
SELECT can_call_inside_out('mapping.ttl'); -- check if inside-out mode is supported
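Both helpers can be called in a single statement before a COPY is attempted. This is a minimal sketch that assumes each function returns a BOOLEAN; the column aliases are illustrative only.

```sql
-- Assumption: both helpers return a BOOLEAN.
SELECT
    is_valid_r2rml('mapping.ttl')      AS mapping_is_valid,
    can_call_inside_out('mapping.ttl') AS inside_out_supported;
```

If inside_out_supported comes back false for a valid mapping, the mapping presumably defines its own queries, in which case the full R2RML form (COPY (SELECT 1) ...) is the one to use.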
Added Functions
| function_name | function_type | description | comment | examples |
|---|---|---|---|---|
| can_call_inside_out | scalar | NULL | NULL | |
| is_valid_r2rml | scalar | NULL | NULL | |
| read_rdf | table | NULL | NULL | |
| read_sparql | table | NULL | NULL | |
Overloaded Functions
| function_name | function_type | description | comment | examples |
|---|---|---|---|---|
Added Types
| type_name | type_size | logical_type | type_category | internal |
|---|---:|---|---|---|
Added Settings
| name | description | input_type | scope | aliases |
|---|---|---|---|---|