Parse and analyze source code ASTs from 25+ programming languages with tree-sitter grammars and pattern matching
Installing and Loading
INSTALL sitting_duck FROM community;
LOAD sitting_duck;
Example
-- Parse Python code and find function definitions
SELECT name, start_line, peek
FROM parse_ast('
def hello():
return "hello world"
def greet(name):
print(f"Hello, {name}!")
return name
', 'python')
WHERE is_function_definition(semantic_type);
-- Pattern matching: find eval() calls and capture arguments
SELECT captures['X'].peek as dangerous_input, file_path, start_line
FROM ast_match('code', 'eval(__X__)', 'python');
-- Pattern matching with variadic wildcards
SELECT captures['F'].name as func_name, captures['Y'].peek as return_value
FROM ast_match('code',
'def __F__(__):
%__<BODY*>__%
return __Y__', 'python');
-- Security audit: find dangerous function calls
SELECT * FROM ast_security_audit('code')
WHERE risk_level = 'high';
-- Function complexity metrics
SELECT name, cyclomatic, max_depth, lines
FROM ast_function_metrics('code')
WHERE cyclomatic > 10;
-- Cross-language analysis with semantic types
SELECT language, COUNT(*) as functions
FROM read_ast(['src/**/*.py', 'src/**/*.js', 'src/**/*.go'])
WHERE is_function_definition(semantic_type)
GROUP BY language;
About sitting_duck
Sitting Duck is a DuckDB extension for parsing source code into Abstract Syntax Trees (ASTs) using tree-sitter grammars. It provides a powerful SQL interface to analyze, search, and understand code structure across 25+ programming languages.
Documentation: https://sitting-duck.readthedocs.io/
Core Features
Pattern Matching (New in v1.3.0): Find code structures using pattern-by-example matching with wildcards:
-- Find all eval() calls
SELECT * FROM ast_match('code', 'eval(__X__)', 'python');
-- Variadic patterns for flexible matching
SELECT * FROM ast_match('code', 'def __F__(__): %__<BODY*>__% return __Y__', 'python');
See: https://sitting-duck.readthedocs.io/en/latest/guide/pattern-matching/
Table Functions:
read_ast(file_pattern, language := NULL)- Parse source files into AST rowsparse_ast(content, language)- Parse source code stringsast_match(table, pattern, lang)- Pattern matching for code search
Analysis Macros:
ast_function_metrics(table)- Cyclomatic complexity, nesting depth, line countsast_security_audit(table)- Detect eval, exec, pickle, SQL injection patternsast_dead_code(table)- Find potentially unused functions/classesast_nesting_analysis(table)- Identify deeply nested codeast_definitions(table)- All named definitions with categories
Tree Navigation:
ast_descendants(table, node_id)- Get subtree (O(1) using descendant_count)ast_ancestors(table, node_id)- Path from node to rootast_children(table, node_id)- Immediate childrenast_function_scope(table, node_id)- Function body excluding nested functions
Semantic Predicates: Cross-language filtering with normalized types:
is_function_definition(st),is_class_definition(st),is_variable_definition(st)is_function_call(st),is_literal(st),is_conditional(st),is_loop(st)
Supported Languages (27)
| Category | Languages |
|---|---|
| Web | JavaScript, TypeScript, HTML, CSS |
| Systems | C, C++, Go, Rust, Zig |
| Scripting | Python, Ruby, PHP, Lua, R, Bash |
| Enterprise | Java, C#, Kotlin, Swift |
| Mobile | Dart |
| Data | SQL, JSON, TOML, GraphQL, HCL |
| Documentation | Markdown |
AST Schema
Each parsed node includes:
type,name- Node type and extracted identifiersemantic_type- Normalized type for cross-language queriesfile_path,language- Source locationstart_line,end_line,depth- Position and nestingdescendant_count- For O(1) subtree queriespeek- Configurable source previewqualified_name,signature_type,parameters,modifiers,annotations- Native extraction
Full schema: https://sitting-duck.readthedocs.io/en/latest/api/output-schema/
Use Cases
- Security auditing - Find dangerous patterns (eval, exec, SQL injection)
- Code quality - Complexity metrics, dead code detection, nesting analysis
- Refactoring - Pattern-based code search across large codebases
- Documentation - Extract function signatures and structure
- AI workflows - Structured code understanding for LLM tools
GitHub: https://github.com/teaguesterling/sitting_duck
Added Functions
| function_name | function_type | description | comment | examples |
|---|---|---|---|---|
| ast_ancestors | table_macro | NULL | NULL | |
| ast_call_arguments | table_macro | NULL | NULL | |
| ast_children | table_macro | NULL | NULL | |
| ast_class_members | table_macro | NULL | NULL | |
| ast_containing_line | table_macro | NULL | NULL | |
| ast_dead_code | table_macro | NULL | NULL | |
| ast_definitions | table_macro | NULL | NULL | |
| ast_descendants | table_macro | NULL | NULL | |
| ast_function_metrics | table_macro | NULL | NULL | |
| ast_function_scope | table_macro | NULL | NULL | |
| ast_functions_containing | table_macro | NULL | NULL | |
| ast_get_source | macro | NULL | NULL | |
| ast_get_source_numbered | macro | NULL | NULL | |
| ast_in_range | table_macro | NULL | NULL | |
| ast_match | table_macro | NULL | NULL | |
| ast_nesting_analysis | table_macro | NULL | NULL | |
| ast_pattern | table_macro | NULL | NULL | |
| ast_pattern_list | macro | NULL | NULL | |
| ast_peek_contains_any | scalar | NULL | NULL | |
| ast_security_audit | table_macro | NULL | NULL | |
| ast_siblings | table_macro | NULL | NULL | |
| ast_supported_languages | table | NULL | NULL | |
| clean_pattern | macro | NULL | NULL | |
| extract_wildcard_rules | macro | NULL | NULL | |
| get_kind | scalar | NULL | NULL | |
| get_line | macro | NULL | NULL | |
| get_lines_text | macro | NULL | NULL | |
| get_searchable_types | scalar | NULL | NULL | |
| get_super_kind | scalar | NULL | NULL | |
| get_variadic_names | macro | NULL | NULL | |
| has_body | scalar | NULL | NULL | |
| is_annotation | macro | NULL | NULL | |
| is_arithmetic | macro | NULL | NULL | |
| is_assignment | macro | NULL | NULL | |
| is_block | macro | NULL | NULL | |
| is_boolean_literal | macro | NULL | NULL | |
| is_call | scalar | NULL | NULL | |
| is_class_definition | macro | NULL | NULL | |
| is_comment | macro | NULL | NULL | |
| is_comparison | macro | NULL | NULL | |
| is_conditional | macro | NULL | NULL | |
| is_construct | scalar | NULL | NULL | |
| is_control_flow | scalar | NULL | NULL | |
| is_declaration_only | scalar | NULL | NULL | |
| is_definition | scalar | NULL | NULL | |
| is_directive | macro | NULL | NULL | |
| is_embodied | scalar | NULL | NULL | |
| is_export | macro | NULL | NULL | |
| is_foreign | macro | NULL | NULL | |
| is_function_call | macro | NULL | NULL | |
| is_function_definition | macro | NULL | NULL | |
| is_identifier | scalar | NULL | NULL | |
| is_import | macro | NULL | NULL | |
| is_jump | macro | NULL | NULL | |
| is_kind | scalar | NULL | NULL | |
| is_list | macro | NULL | NULL | |
| is_literal | macro | NULL | NULL | |
| is_logical | macro | NULL | NULL | |
| is_loop | macro | NULL | NULL | |
| is_member_access | macro | NULL | NULL | |
| is_module_definition | macro | NULL | NULL | |
| is_number_literal | macro | NULL | NULL | |
| is_parser_specific | scalar | NULL | NULL | |
| is_pattern_wildcard | macro | NULL | NULL | |
| is_punctuation | scalar | NULL | NULL | |
| is_semantic_type | scalar | NULL | NULL | |
| is_string_literal | macro | NULL | NULL | |
| is_syntax_only | scalar | NULL | NULL | |
| is_type_composite | macro | NULL | NULL | |
| is_type_definition | macro | NULL | NULL | |
| is_type_generic | macro | NULL | NULL | |
| is_type_primitive | macro | NULL | NULL | |
| is_type_reference | macro | NULL | NULL | |
| is_variable_definition | macro | NULL | NULL | |
| kind_code | scalar | NULL | NULL | |
| parse_ast | table | NULL | NULL | |
| parse_ast_flat | table | NULL | NULL | |
| parse_ast_hierarchical | table | NULL | NULL | |
| parse_html_wildcard | macro | NULL | NULL | |
| pattern_has_variadic | macro | NULL | NULL | |
| read_ast | table | NULL | NULL | |
| read_ast_flat | table | NULL | NULL | |
| read_ast_hierarchical | table | NULL | NULL | |
| read_ast_hierarchical_new | table | NULL | NULL | |
| read_lines | table_macro | NULL | NULL | |
| read_lines_context | table_macro | NULL | NULL | |
| read_lines_range | table_macro | NULL | NULL | |
| semantic_type_base | macro | NULL | NULL | |
| semantic_type_code | scalar | NULL | NULL | |
| semantic_type_to_string | scalar | NULL | NULL | |
| string_contains_any | scalar | NULL | NULL | |
| string_contains_any_i | scalar | NULL | NULL | |
| wildcard_capture_name | macro | NULL | NULL |