Search Shortcut cmd + k | ctrl + k
yaml

Read YAML files into DuckDB with native YAML type support, comprehensive extraction functions, and seamless JSON interoperability

Maintainer(s): teaguesterling

Installing and Loading

INSTALL yaml FROM community;
LOAD yaml;

Example

-- Load the extension
LOAD yaml;

-- Query YAML files directly
SELECT * FROM 'config.yaml';
SELECT * FROM 'data/*.yml' WHERE active = true;

-- Create tables with YAML columns
CREATE TABLE configs(id INTEGER, config YAML);
INSERT INTO configs VALUES (1, E'server: production\nport: 8080\nfeatures: [logging, metrics]');

-- Extract data using YAML functions
SELECT 
    yaml_extract_string(config, '$.server') AS environment,
    yaml_extract(config, '$.port') AS port,
    yaml_extract(config, '$.features[0]') AS first_feature
FROM configs;

-- Convert between YAML and JSON
SELECT yaml_to_json(config) AS json_config FROM configs;
SELECT value_to_yaml({name: 'John', age: 30}) AS yaml_person;

-- Write query results to YAML
COPY (SELECT * FROM users) TO 'output.yaml' (FORMAT yaml, STYLE block);

About yaml

The YAML extension brings comprehensive YAML support to DuckDB, enabling seamless integration of YAML data within SQL queries.

Key Features:

  • Native YAML Type: Full YAML type support with automatic casting between YAML, JSON, and VARCHAR
  • File Reading: Read YAML files with read_yaml() and read_yaml_objects() functions supporting multi-document files, top-level sequences, and robust error handling
  • Direct File Querying: Query YAML files directly using FROM 'file.yaml' syntax
  • Extraction Functions: Query YAML data with yaml_extract(), yaml_type(), yaml_exists(), and path-based extraction
  • Type Detection: Comprehensive automatic type detection for temporal types (DATE, TIME, TIMESTAMP), optimal numeric types, and boolean values
  • Column Type Specification: Explicitly define column types when reading YAML files for schema consistency
  • YAML Output: Write query results to YAML files using COPY TO with configurable formatting styles
  • Multi-Document Support: Handle files with multiple YAML documents separated by ---
  • Error Recovery: Continue processing valid documents even when some contain errors
  • JSON Interoperability: Seamless conversion between YAML and JSON formats

Example Use Cases:

  • Configuration file management and querying
  • Log file analysis and processing
  • Data migration between YAML and relational formats
  • Integration with YAML-based CI/CD pipelines
  • Processing Kubernetes manifests and Helm charts

The extension is built using yaml-cpp and follows DuckDB's extension development best practices, ensuring reliable performance and cross-platform compatibility.

Note: This extension was written primarily using Claude and Claude Code as an exercise in AI-driven development.

Added Functions

function_name function_type description comment examples
copy_format_yaml scalar NULL NULL  
format_yaml scalar NULL NULL  
read_yaml table NULL NULL  
read_yaml_objects table NULL NULL  
value_to_yaml scalar NULL NULL  
yaml_exists scalar NULL NULL  
yaml_extract scalar NULL NULL  
yaml_extract_string scalar NULL NULL  
yaml_get_default_style scalar NULL NULL  
yaml_set_default_style scalar NULL NULL  
yaml_to_json scalar NULL NULL  
yaml_type scalar NULL NULL  
yaml_valid scalar NULL NULL  

Added Types

type_name type_size logical_type type_category internal
yaml 16 VARCHAR STRING true