Search Shortcut cmd + k | ctrl + k
sitting_duck

Parse and analyze source code ASTs from 27 programming languages with tree-sitter grammars, pattern matching, and structural search

Maintainer(s): teaguesterling

Installing and Loading

INSTALL sitting_duck FROM community;
LOAD sitting_duck;

Example

-- Parse Python code and find function definitions
SELECT name, start_line, peek
FROM parse_ast('
def hello():
    return "hello world"
def greet(name):
    print(f"Hello, {name}!")
    return name
', 'python')
WHERE is_function_definition(semantic_type);

-- Pattern matching: find eval() calls and capture arguments
SELECT captures['X'].peek as dangerous_input, file_path, start_line
FROM ast_match('code', 'eval(__X__)', 'python');

-- Pattern matching with variadic wildcards
SELECT captures['F'].name as func_name, captures['Y'].peek as return_value
FROM ast_match('code',
    'def __F__(__):
    %__<BODY*>__%
    return __Y__', 'python');

-- Security audit: find dangerous function calls
SELECT * FROM ast_security_audit('code')
WHERE risk_level = 'high';

-- Function complexity metrics
SELECT name, cyclomatic, max_depth, lines
FROM ast_function_metrics('code')
WHERE cyclomatic > 10;

-- Cross-language analysis with semantic types
SELECT language, COUNT(*) as functions
FROM read_ast(['src/**/*.py', 'src/**/*.js', 'src/**/*.go'])
WHERE is_function_definition(semantic_type)
GROUP BY language;

About sitting_duck

Sitting Duck is a DuckDB extension for parsing source code into Abstract Syntax Trees (ASTs) using tree-sitter grammars. It provides a powerful SQL interface to analyze, search, and understand code structure across 25+ programming languages.

Documentation: https://sitting-duck.readthedocs.io/

Core Features

Pattern Matching: Find code structures using pattern-by-example matching with wildcards:

-- Recursive wildcards for deep matching
SELECT * FROM ast_match('code', 'class __C__: %__<**>__% def __M__(self): %__<BODY*>__%', 'python');

-- Optional and negation wildcards
SELECT * FROM ast_match('code', 'def __F__(__<?ARGS>__): %__<BODY*>__%', 'python');

See: https://sitting-duck.readthedocs.io/en/latest/guide/pattern-matching/

Table Functions:

  • read_ast(file_pattern, language := NULL) - Parse source files into AST rows
  • parse_ast(content, language) - Parse source code strings
  • ast_match(source, pattern, lang) - Pattern matching for code search

Relational Predicates (v1.5.0):

  • ast_has(source, parent_type, child_type) - Check containment relationships
  • ast_inside(source, child_type, parent_type, parent_name) - Find nodes inside specific parents
  • ast_precedes(source, before_type, after_type) - Ordering predicates
  • ast_follows(source, after_type, before_type) - Ordering predicates
  • ast_not_has(source, parent_type, child_type) - Negated containment

Analysis Macros (file-path based):

  • ast_definitions(source) - All named definitions with categories
  • ast_function_metrics(source) - Cyclomatic complexity, nesting depth, line counts
  • ast_security_audit(source) - Detect dangerous function call patterns
  • ast_dead_code(source) - Find potentially unused functions/classes
  • ast_nesting_analysis(source) - Identify deeply nested code
  • ast_definition_parent(table) - Resolve nearest definition ancestor per node

Tree Navigation:

  • ast_descendants(table, node_id) - Get subtree (O(1) using descendant_count)
  • ast_ancestors(table, node_id) - Path from node to root
  • ast_children(table, node_id) - Immediate children
  • ast_function_scope(table, node_id) - Function body excluding nested functions

Semantic Predicates: Cross-language filtering with normalized types:

  • is_function_definition(st), is_class_definition(st), is_variable_definition(st)
  • is_function_call(st), is_literal(st), is_conditional(st), is_loop(st)

Supported Languages (27)

Category Languages
Web JavaScript, TypeScript, HTML, CSS
Systems C, C++, Go, Rust, Zig
Scripting Python, Ruby, PHP, Lua, R, Bash
Enterprise Java, C#, Kotlin, Swift
Mobile Dart
Data SQL, JSON, TOML, GraphQL, HCL
Documentation Markdown

AST Schema

Each parsed node includes:

  • type, name - Node type and extracted identifier
  • semantic_type - Normalized type for cross-language queries
  • file_path, language - Source location
  • start_line, end_line, depth - Position and nesting
  • descendant_count - For O(1) subtree queries
  • peek - Configurable source preview
  • qualified_name, signature_type, parameters, modifiers, annotations - Native extraction

Full schema: https://sitting-duck.readthedocs.io/en/latest/api/output-schema/

Use Cases

  • Security auditing - Find dangerous patterns (eval, exec, SQL injection)
  • Code quality - Complexity metrics, dead code detection, nesting analysis
  • Refactoring - Pattern-based code search across large codebases
  • Documentation - Extract function signatures and structure
  • AI workflows - Structured code understanding for LLM tools

GitHub: https://github.com/teaguesterling/sitting_duck

Added Functions

function_name function_type description comment examples
ast_ancestors table_macro NULL NULL  
ast_call_arguments table_macro NULL NULL  
ast_capture macro NULL NULL  
ast_children table_macro NULL NULL  
ast_class_members table_macro NULL NULL  
ast_containing_line table_macro NULL NULL  
ast_dead_code table_macro NULL NULL  
ast_definition_parent table_macro NULL NULL  
ast_definitions table_macro NULL NULL  
ast_descendants table_macro NULL NULL  
ast_follows table_macro NULL NULL  
ast_function_metrics table_macro NULL NULL  
ast_function_scope table_macro NULL NULL  
ast_functions_containing table_macro NULL NULL  
ast_get_source macro NULL NULL  
ast_get_source_line macro NULL NULL  
ast_get_source_numbered macro NULL NULL  
ast_has table_macro NULL NULL  
ast_in_range table_macro NULL NULL  
ast_inside table_macro NULL NULL  
ast_match table_macro NULL NULL  
ast_nesting_analysis table_macro NULL NULL  
ast_not_has table_macro NULL NULL  
ast_pattern table_macro NULL NULL  
ast_pattern_list macro NULL NULL  
ast_peek_contains_any scalar NULL NULL  
ast_precedes table_macro NULL NULL  
ast_security_audit table_macro NULL NULL  
ast_siblings table_macro NULL NULL  
ast_source_of table_macro NULL NULL  
ast_supported_languages table NULL NULL  
clean_pattern macro NULL NULL  
detect_language scalar NULL NULL  
extract_wildcard_rules macro NULL NULL  
get_kind scalar NULL NULL  
get_searchable_types scalar NULL NULL  
get_super_kind scalar NULL NULL  
get_variadic_names macro NULL NULL  
has_body scalar NULL NULL  
is_annotation macro NULL NULL  
is_arithmetic macro NULL NULL  
is_assignment macro NULL NULL  
is_block macro NULL NULL  
is_boolean_literal macro NULL NULL  
is_call scalar NULL NULL  
is_class_definition macro NULL NULL  
is_comment macro NULL NULL  
is_comparison macro NULL NULL  
is_conditional macro NULL NULL  
is_construct scalar NULL NULL  
is_control_flow scalar NULL NULL  
is_declaration_only scalar NULL NULL  
is_definition scalar NULL NULL  
is_directive macro NULL NULL  
is_embodied scalar NULL NULL  
is_export macro NULL NULL  
is_foreign macro NULL NULL  
is_function_call macro NULL NULL  
is_function_definition macro NULL NULL  
is_identifier scalar NULL NULL  
is_import macro NULL NULL  
is_jump macro NULL NULL  
is_kind scalar NULL NULL  
is_list macro NULL NULL  
is_literal macro NULL NULL  
is_logical macro NULL NULL  
is_loop macro NULL NULL  
is_member_access macro NULL NULL  
is_module_definition macro NULL NULL  
is_number_literal macro NULL NULL  
is_parser_specific scalar NULL NULL  
is_pattern_wildcard macro NULL NULL  
is_punctuation scalar NULL NULL  
is_semantic_type scalar NULL NULL  
is_string_literal macro NULL NULL  
is_syntax_only scalar NULL NULL  
is_type_composite macro NULL NULL  
is_type_definition macro NULL NULL  
is_type_generic macro NULL NULL  
is_type_primitive macro NULL NULL  
is_type_reference macro NULL NULL  
is_variable_definition macro NULL NULL  
kind_code scalar NULL NULL  
parse_ast table NULL NULL  
parse_ast_flat table NULL NULL  
parse_ast_hierarchical table NULL NULL  
parse_html_wildcard macro NULL NULL  
pattern_has_recursive macro NULL NULL  
pattern_has_variadic macro NULL NULL  
read_ast table NULL NULL  
read_ast_flat table NULL NULL  
read_ast_hierarchical table NULL NULL  
read_ast_hierarchical_new table NULL NULL  
semantic_type_base macro NULL NULL  
semantic_type_code scalar NULL NULL  
semantic_type_to_string scalar NULL NULL  
string_contains_any scalar NULL NULL  
string_contains_any_i scalar NULL NULL  
wildcard_capture_name macro NULL NULL  

Overloaded Functions

This extension does not add any function overloads.

Added Types

This extension does not add any types.

Added Settings

This extension does not add any settings.