Parse and analyze source code ASTs from 20+ programming languages with tree-sitter grammars
Maintainer(s):
teaguesterling
Installing and Loading
INSTALL sitting_duck FROM community;
LOAD sitting_duck;
Example
-- Parse Python code and extract function definitions
SELECT name, type, depth
FROM parse_ast('
def hello():
return "hello world"
', 'python')
WHERE type = 'function_definition';
-- Parse JavaScript from a file
SELECT type, name, start_row, end_row
FROM read_ast('src/**/*.js')
WHERE type IN ('function_declaration', 'class_declaration');
-- Analyze code complexity by counting AST nodes
SELECT file_path, COUNT(*) as node_count
FROM read_ast('src/**/*.py')
GROUP BY file_path
ORDER BY node_count DESC;
-- Find all function calls in TypeScript
SELECT name, file_path, start_row
FROM read_ast('app.ts', language := 'typescript')
WHERE type = 'call_expression';
-- Cross-language analysis
SELECT language, COUNT(*) as definitions
FROM read_ast(['src/**/*.py', 'src/**/*.js', 'src/**/*.ts', 'src/**/*.go'])
WHERE type LIKE '%definition%' OR type LIKE '%declaration%'
GROUP BY language;
About sitting_duck
Sitting Duck is a DuckDB extension for parsing source code into Abstract Syntax Trees (ASTs) using tree-sitter grammars. It provides a SQL interface to analyze code structure across 20+ programming languages.
Core Table Functions:
read_ast(file_pattern, language := NULL)- Parse source files into AST rowsparse_ast(content, language)- Parse source code strings into AST rows
Supported Languages (20):
- Systems: C, C++, Rust, Go
- Web: JavaScript, TypeScript, HTML, CSS, JSON
- Backend: Python, Ruby, PHP, Java, Kotlin, C#
- Mobile: Swift, Dart
- Scripting: Bash, Lua
- Data: SQL, R, Markdown
AST Schema Fields:
type- Node type (e.g., function_definition, class_declaration)name- Extracted identifier name when applicablefile_path- Source file pathlanguage- Detected or specified languagestart_row,start_column- Node start positionend_row,end_column- Node end positiondepth- Nesting depth in the ASTparent_type- Type of parent nodedescendant_count- Number of descendant nodes
Key Features:
- Automatic language detection from file extensions
- Glob pattern support for multi-file analysis
- Semantic type normalization across languages
- Efficient tree-sitter parsing with pre-generated grammars
- DuckDB native SQL integration for complex queries
Perfect for code analysis, refactoring tools, documentation generation, codebase exploration, and AI-assisted development workflows.
Added Functions
| function_name | function_type | description | comment | examples |
|---|---|---|---|---|
| ast_supported_languages | table | NULL | NULL | |
| get_kind | scalar | NULL | NULL | |
| get_searchable_types | scalar | NULL | NULL | |
| get_super_kind | scalar | NULL | NULL | |
| is_call | scalar | NULL | NULL | |
| is_control_flow | scalar | NULL | NULL | |
| is_definition | scalar | NULL | NULL | |
| is_identifier | scalar | NULL | NULL | |
| is_kind | scalar | NULL | NULL | |
| is_semantic_type | scalar | NULL | NULL | |
| kind_code | scalar | NULL | NULL | |
| parse_ast | table | NULL | NULL | |
| parse_ast_flat | table | NULL | NULL | |
| parse_ast_hierarchical | table | NULL | NULL | |
| read_ast | table | NULL | NULL | |
| read_ast_flat | table | NULL | NULL | |
| read_ast_hierarchical | table | NULL | NULL | |
| read_ast_hierarchical_new | table | NULL | NULL | |
| semantic_type_code | scalar | NULL | NULL | |
| semantic_type_to_string | scalar | NULL | NULL |