ASTral/
├── ASTral.sln
├── global.json # .NET SDK version (10.0)
├── README.md
├── docs/
│ ├── ARCHITECTURE.md
│ ├── LANGUAGE_SUPPORT.md
│ ├── SECURITY.md
│ ├── SPEC.md
│ ├── TOKEN_SAVINGS.md
│ └── USER_GUIDE.md
│
├── src/ASTral/
│ ├── ASTral.csproj
│ ├── Program.cs # Entry point: DI setup + MCP server bootstrap
│ │
│ ├── Configuration/
│ │ └── AstralConfig.cs # .astralrc JSON config file loading
│ │
│ ├── Models/
│ │ ├── CodeIndex.cs # Repository index: search, scoring, pattern matching
│ │ ├── Symbol.cs # Sealed record: ID generation, content hashing, JSON attributes
│ │ └── SymbolNode.cs # Hierarchical tree building for outlines
│ │
│ ├── Parser/
│ │ ├── LanguageRegistry.cs # LanguageSpec registry for 16 languages
│ │ └── SymbolExtractor.cs # tree-sitter AST walking + symbol extraction
│ │
│ ├── Security/
│ │ └── SecurityValidator.cs # Path traversal, symlink, secret, binary detection
│ │
│ ├── Services/
│ │ └── FileWatcherService.cs # Background file watcher for auto re-indexing
│ │
│ ├── Storage/
│ │ ├── IndexStore.cs # Save/load indexes, incremental indexing, byte-offset retrieval
│ │ └── TokenTracker.cs # Persistent token savings counter (~/.code-index/_savings.json)
│ │
│ ├── Summarizer/
│ │ ├── BatchSummarizer.cs # Docstring > AI > signature fallback
│ │ └── FileSummarizer.cs # File-level heuristic summaries
│ │
│ ├── Utils/
│ │ └── GlobMatcher.cs # Glob/wildcard pattern matching (* and ? support)
│ │
│ └── Tools/
│ ├── ToolUtils.cs # Shared tool helpers (repo resolution, meta builder)
│ ├── IndexRepoTool.cs # GitHub repository indexing
│ ├── IndexFolderTool.cs # Local folder indexing
│ ├── ListReposTool.cs
│ ├── GetFileTreeTool.cs
│ ├── GetFileOutlineTool.cs
│ ├── GetSymbolTool.cs
│ ├── GetSymbolsTool.cs
│ ├── SearchSymbolsTool.cs
│ ├── SearchTextTool.cs
│ ├── GetRepoOutlineTool.cs
│ └── InvalidateCacheTool.cs
│
└── tests/ASTral.Tests/
├── ASTral.Tests.csproj
├── AstralConfigTests.cs # Config file loading, env var overrides
├── CodeIndexTests.cs # Search algorithm, symbol retrieval
├── IndexStoreTests.cs # Storage, incremental indexing, versioning
├── LanguageRegistryTests.cs # Language spec validation
├── SecurityValidatorTests.cs # Path validation, secret detection
├── SymbolTests.cs # Symbol ID generation, hashing, equality
├── TokenTrackerTests.cs # Token tracking, cost calculations
└── ToolIntegrationTests.cs # End-to-end tool workflow tests
Source code (GitHub API or local folder)
|
v
Security filters (path traversal, symlinks, secrets, binary, size)
|
v
tree-sitter parsing (language-specific grammars via LanguageSpec)
|
v
Symbol extraction (functions, classes, methods, constants, types)
|
v
Post-processing (overload disambiguation, content hashing)
|
v
Summarization (docstring > AI batch > signature fallback)
|
v
Storage (JSON index + raw files, atomic writes)
|
v
MCP tools (discovery, search, retrieval)
The parser follows a language registry pattern. Each of the 16 supported languages defines a LanguageSpec describing how symbols are extracted from its AST.
public record LanguageSpec(
string TsLanguage,
Dictionary<string, string> SymbolNodeTypes,
Dictionary<string, string> NameFields,
Dictionary<string, string> ParamFields,
Dictionary<string, string> ReturnTypeFields,
string DocstringStrategy,
string? DecoratorNodeType,
List<string> ContainerNodeTypes,
List<string> ConstantPatterns,
List<string> TypePatterns
);The generic extractor performs two post-processing passes:
-
Overload disambiguation Duplicate symbol IDs receive numeric suffixes (
~1,~2, etc.) -
Content hashing SHA-256 hashes of symbol source content enable change detection.
{file_path}::{qualified_name}#{kind}
Examples:
src/main.py::UserService.login#methodsrc/utils.py::authenticate#functionconfig.py::MAX_RETRIES#constant
IDs remain stable across re-indexing as long as the file path, qualified name, and symbol kind remain unchanged.
Indexes are stored at ~/.code-index/ (configurable via CODE_INDEX_PATH):
{owner}-{name}.json— metadata, file hashes, symbol metadata, file summaries{owner}-{name}/— cached raw source files
Each symbol records byte offsets, allowing O(1) retrieval via seek() + read() without re-parsing.
Incremental indexing compares stored file hashes with current hashes, reprocessing only changed files. Writes are atomic (temporary file + rename).
All file operations pass through SecurityValidator:
- Path traversal protection via validated resolved paths
- Symlink target validation
- Secret-file exclusion using predefined patterns
- Binary file detection (extension-based + null-byte content sniffing)
- Safe encoding reads using replacement characters
All tool responses include metadata:
{
"result": "...",
"_meta": {
"timing_ms": 42,
"repo": "owner/repo",
"symbol_count": 387,
"truncated": false,
"tokens_saved": 2450,
"total_tokens_saved": 184320,
"cost_avoided": { "claude_opus": 0.0368, "gpt5_latest": 0.0245 },
"total_cost_avoided": { "claude_opus": 2.76, "gpt5_latest": 1.84 }
}
}tokens_saved and total_tokens_saved are included on all retrieval and search tools. The running total is persisted to ~/.code-index/_savings.json across sessions.
search_symbols uses weighted scoring:
| Match type | Weight |
|---|---|
| Exact name match | +20 |
| Name substring | +10 |
| Name word overlap | +5 per word |
| Signature match | +8 (full) / +2 (word) |
| Summary match | +5 (full) / +1 (word) |
| Docstring/keyword match | +3 / +1 per word |
Filters (kind, language, file_pattern) are applied before scoring. Results scoring zero are excluded.
ASTral uses Microsoft.Extensions.Hosting for dependency injection and server lifecycle:
builder.Services.AddSingleton<IndexStore>();
builder.Services.AddSingleton<TokenTracker>();
builder.Services.AddSingleton<SymbolExtractor>();
builder.Services.AddSingleton<BatchSummarizer>();MCP tools are auto-discovered from the assembly via [McpServerTool] attributes and receive dependencies as method parameters.
| Package | Purpose |
|---|---|
ModelContextProtocol |
MCP server framework |
Microsoft.Extensions.Hosting |
Dependency injection and hosting |
TreeSitter.DotNet |
tree-sitter AST parsing |
MAB.DotIgnore |
.gitignore pattern matching |
Anthropic |
AI summarization via Claude Haiku |
xunit.v3 |
Unit testing (test project) |
Microsoft.NET.Test.Sdk |
Test infrastructure (test project) |