Search Shortcut cmd + k | ctrl + k
sitting_duck

Parse and analyze source code ASTs from 20+ programming languages with tree-sitter grammars

Maintainer(s): teaguesterling

Installing and Loading

INSTALL sitting_duck FROM community;
LOAD sitting_duck;

Example

-- Parse Python code and extract function definitions
SELECT name, type, depth
FROM parse_ast('
def hello():
    return "hello world"
', 'python')
WHERE type = 'function_definition';

-- Parse JavaScript from a file
SELECT type, name, start_row, end_row
FROM read_ast('src/**/*.js')
WHERE type IN ('function_declaration', 'class_declaration');

-- Analyze code complexity by counting AST nodes
SELECT file_path, COUNT(*) as node_count
FROM read_ast('src/**/*.py')
GROUP BY file_path
ORDER BY node_count DESC;

-- Find all function calls in TypeScript
SELECT name, file_path, start_row
FROM read_ast('app.ts', language := 'typescript')
WHERE type = 'call_expression';

-- Cross-language analysis
SELECT language, COUNT(*) as definitions
FROM read_ast(['src/**/*.py', 'src/**/*.js', 'src/**/*.ts', 'src/**/*.go'])
WHERE type LIKE '%definition%' OR type LIKE '%declaration%'
GROUP BY language;

About sitting_duck

Sitting Duck is a DuckDB extension for parsing source code into Abstract Syntax Trees (ASTs) using tree-sitter grammars. It provides a SQL interface to analyze code structure across 20+ programming languages.

Core Table Functions:

  • read_ast(file_pattern, language := NULL) - Parse source files into AST rows
  • parse_ast(content, language) - Parse source code strings into AST rows

All pasring is highly flexible via parameters (see: https://github.com/teaguesterling/sitting_duck/blob/main/docs/api/parameters.md)

Supported Languages (27): | Category | Languages | |———-|———–| | Web | JavaScript, TypeScript, HTML, CSS | | Systems | C, C++, Go, Rust, Zig | | Scripting | Python, Ruby, PHP, Lua, R, Bash | | Enterprise | Java, C#, Kotlin, Swift | | Mobile | Dart | | Infrastructure | HCL (Terraform), JSON, TOML, GraphQL | | Documentation | SQL, Markdown |

AST Schema Fields: See: https://github.com/teaguesterling/sitting_duck/blob/main/docs/api/output-schema.md

  • type - Node type (e.g., function_definition, class_declaration)
  • name - Extracted identifier name when applicable
  • file_path - Source file path
  • language - Detected or specified language
  • start_line, start_column - Node start position
  • end_end, end_column - Node end position
  • depth - Nesting depth in the AST
  • parent_type - Type of parent node
  • descendant_count - Number of descendant nodes
  • peek - Customizable preview of node source
  • semantic_type - Normalized node type across all langauges
  • qualified_name - Extracted name and available scope qualifiers
  • signature_type - Extracted type string for variables, expressions, and functions as available
  • parameters - Extraacted function parameters, arguments, parent clases, etc. as available
  • modifiers - Class, variable, parameter modifiers as available
  • annotations - Decorators and annotations as avaialble

See https://github.com/teaguesterling/sitting_duck/blob/main/docs/native_extraction_semantics.md for additional details on "native" extraction for each langauge. Note: this is an active area of development.

Key Features:

  • Automatic language detection from file extensions
  • Glob pattern support for multi-file analysis
  • Semantic type normalization across languages (https://github.com/teaguesterling/sitting_duck/blob/main/docs/api/semantic-types.md)
  • Native, locally aware annotation of key node types for easy handling (https://github.com/teaguesterling/sitting_duck/blob/main/docs/native_extraction_semantics.md)
  • Efficient tree-sitter parsing with pre-generated grammars
  • DuckDB native SQL integration for complex queries

Perfect for code analysis, refactoring tools, documentation generation, codebase exploration, and AI-assisted development workflows.

Added Functions

function_name function_type description comment examples
ast_get_source macro NULL NULL  
ast_get_source_numbered macro NULL NULL  
ast_supported_languages table NULL NULL  
get_kind scalar NULL NULL  
get_line macro NULL NULL  
get_lines_text macro NULL NULL  
get_searchable_types scalar NULL NULL  
get_super_kind scalar NULL NULL  
is_arithmetic macro NULL NULL  
is_assignment macro NULL NULL  
is_block macro NULL NULL  
is_boolean_literal macro NULL NULL  
is_call scalar NULL NULL  
is_class_definition macro NULL NULL  
is_comparison macro NULL NULL  
is_conditional macro NULL NULL  
is_control_flow scalar NULL NULL  
is_definition scalar NULL NULL  
is_function_call macro NULL NULL  
is_function_definition macro NULL NULL  
is_identifier scalar NULL NULL  
is_jump macro NULL NULL  
is_kind scalar NULL NULL  
is_list macro NULL NULL  
is_literal macro NULL NULL  
is_logical macro NULL NULL  
is_loop macro NULL NULL  
is_member_access macro NULL NULL  
is_module_definition macro NULL NULL  
is_number_literal macro NULL NULL  
is_semantic_type scalar NULL NULL  
is_string_literal macro NULL NULL  
is_type_definition macro NULL NULL  
is_variable_definition macro NULL NULL  
kind_code scalar NULL NULL  
parse_ast table NULL NULL  
parse_ast_flat table NULL NULL  
parse_ast_hierarchical table NULL NULL  
read_ast table NULL NULL  
read_ast_flat table NULL NULL  
read_ast_hierarchical table NULL NULL  
read_ast_hierarchical_new table NULL NULL  
read_lines table_macro NULL NULL  
read_lines_context table_macro NULL NULL  
read_lines_range table_macro NULL NULL  
semantic_type_code scalar NULL NULL  
semantic_type_to_string scalar NULL NULL