Merged
1 change: 1 addition & 0 deletions .gitignore
@@ -14,3 +14,4 @@ dist/
.vscode/
.trae/documents/
.trae/
.cursor/
17 changes: 8 additions & 9 deletions AGENTS.md
@@ -5,25 +5,25 @@
**Branch:** refactor/cli-commands-architecture

## OVERVIEW
git-ai CLI + MCP server. TypeScript implementation for AI-powered Git operations with semantic search, DSR (Deterministic Semantic Record), and graph-based code analysis. Indices stored in `.git-ai/`.
git-ai CLI + MCP server. TypeScript implementation for AI-powered Git operations with semantic search and graph-based code analysis. Indices stored in `.git-ai/`.

## STRUCTURE
```
git-ai-cli-v2/
├── src/
│ ├── cli/ # CLI command architecture (NEW: registry + handlers + schemas)
│ │ ├── types.ts # Core types, executeHandler
│ │ ├── registry.ts # Handler registry (24 commands)
│ │ ├── registry.ts # Handler registry (20 commands)
│ │ ├── helpers.ts # Shared utilities
│ │ ├── schemas/ # Zod validation schemas
│ │ ├── handlers/ # Business logic handlers
│ │ └── commands/ # Commander.js wrappers
│ ├── commands/ # Command aggregator (ai.ts only)
│ ├── core/ # Indexing, DSR, graph, storage, parsers
│ ├── core/ # Indexing, graph, storage, parsers
│ └── mcp/ # MCP server implementation
├── test/ # Node test runner tests
├── dist/ # Build output
└── .git-ai/ # Indices (LanceDB + DSR)
└── .git-ai/ # Indices (LanceDB)
```

## WHERE TO LOOK
@@ -32,12 +32,12 @@ git-ai-cli-v2/
| CLI commands | `src/cli/commands/*.ts` (new architecture) |
| CLI handlers | `src/cli/handlers/*.ts` (business logic) |
| CLI schemas | `src/cli/schemas/*.ts` (Zod validation) |
| Handler registry | `src/cli/registry.ts` (all 24 commands) |
| Handler registry | `src/cli/registry.ts` (all 20 commands) |
| Command aggregator | `src/commands/ai.ts` (entry point) |
| Indexing logic | `src/core/indexer.ts`, `src/core/indexerIncremental.ts` |
| DSR (commit records) | `src/core/dsr/`, `src/core/dsr.ts` |
| Graph queries | `src/core/cozo.ts`, `src/core/astGraph.ts` |
| Semantic search | `src/core/semantic.ts`, `src/core/sq8.ts` |
| Repo map | `src/core/repoMap.ts` |
| MCP tools | `src/mcp/`, `src/core/graph.ts` |
| Language parsers | `src/core/parser/*.ts` |

@@ -47,9 +47,9 @@ git-ai-cli-v2/
| `indexer` | fn | `core/indexer.ts` | Full repository indexing |
| `incrementalIndexer` | fn | `core/indexerIncremental.ts` | Incremental updates |
| `GitAiService` | class | `mcp/index.ts` | MCP entry point |
| `runDsr` | fn | `commands/dsr.ts` | DSR CLI command |
| `cozoQuery` | fn | `core/cozo.ts` | Graph DB queries |
| `semanticSearch` | fn | `core/semantic.ts` | Vector similarity |
| `repoMap` | fn | `core/repoMap.ts` | PageRank-based repo overview |
| `resolveGitRoot` | fn | `core/git.ts` | Repo boundary detection |

## CONVENTIONS
@@ -69,8 +69,8 @@ git-ai-cli-v2/
## UNIQUE STYLES
- `.git-ai/` directory for all index data (not config files)
- MCP tools require explicit `path` argument
- DSR files per commit for reproducible queries
- Multi-language parser architecture (TS, Go, Rust, Python, C, Markdown, YAML)
- PageRank-based repo-map for code importance scoring
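
As a rough illustration of the PageRank-based scoring mentioned in the last bullet, the sketch below ranks files in a tiny reference graph. It is not taken from `src/core/repoMap.ts`: the `pageRank` function, the edge format, the damping factor of 0.85, and the fixed iteration count are all assumptions made for the example.

```typescript
// Hypothetical sketch, NOT the project's repoMap implementation.
// An edge [from, to] means "from" references (imports or calls) "to".
type Edge = [string, string];

function pageRank(edges: Edge[], damping = 0.85, iterations = 50): Map<string, number> {
  const nodes: string[] = [];
  const outgoing = new Map<string, string[]>();
  const addNode = (name: string): void => {
    if (!outgoing.has(name)) {
      outgoing.set(name, []);
      nodes.push(name);
    }
  };
  for (const [from, to] of edges) {
    addNode(from);
    addNode(to);
    outgoing.get(from)!.push(to);
  }

  const n = nodes.length;
  let rank = new Map<string, number>();
  for (const node of nodes) rank.set(node, 1 / n);

  for (let i = 0; i < iterations; i++) {
    const next = new Map<string, number>();
    // Every node starts each round with the "teleport" share.
    for (const node of nodes) next.set(node, (1 - damping) / n);

    let dangling = 0; // rank held by nodes with no outgoing edges
    for (const node of nodes) {
      const targets = outgoing.get(node)!;
      const r = rank.get(node)!;
      if (targets.length === 0) {
        dangling += r;
        continue;
      }
      for (const t of targets) {
        next.set(t, next.get(t)! + (damping * r) / targets.length);
      }
    }
    // Redistribute dangling rank evenly so the total stays at 1.
    for (const node of nodes) {
      next.set(node, next.get(node)! + (damping * dangling) / n);
    }
    rank = next;
  }
  return rank;
}
```

Files that many other files reference end up with a higher score, which is the property a repo overview would sort on.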

## COMMANDS
```bash
@@ -84,5 +84,4 @@ node dist/bin/git-ai.js --help # Validate packaged output
## NOTES
- Indices auto-update on git operations
- `checkIndex` gates symbol/semantic/graph queries
- DSR commit hash mismatch with HEAD triggers warning
- MCP server exposes git-ai tools for external IDEs
80 changes: 32 additions & 48 deletions README.md
@@ -39,12 +39,12 @@ npm install -g @mars167/git-ai

**Code semantics should be versioned and traceable, just like code itself**

git-ai is a local code understanding tool that builds a traceable semantic layer for your codebase using DSR (Deterministic Semantic Record) and Hyper RAG, enabling AI Agents and developers to truly understand code evolution and relationships.
git-ai is a local code understanding tool that builds a semantic layer for your codebase using advanced RAG techniques, enabling AI Agents and developers to deeply understand code structure and relationships.

### ✨ Why git-ai?

- **🔗 Hyper RAG**: Combines vector retrieval + graph retrieval + DSR for multi-dimensional semantic understanding
- **📜 Versioned Semantics**: Every commit has a semantic snapshot, historical changes are clear and traceable
- **🔗 Advanced RAG**: Combines vector retrieval + graph retrieval for multi-dimensional semantic understanding
- **📊 Fast & Accurate**: Optimized repo-map with PageRank-based importance scoring
- **🔄 Always Available**: Indices travel with code, available immediately after checkout, no rebuild needed
- **🤖 AI-Native**: MCP Server enables Claude, Trae and other Agents to deeply understand your codebase
- **🔒 Fully Local**: Code never leaves your machine, secure and private
@@ -80,19 +80,7 @@ git-ai ai graph callees authenticateUser
git-ai ai graph chain authenticateUser --max-depth 3
```

### 3️⃣ Historical Change Tracing

Track symbol evolution through DSR:

```bash
# View function's historical changes
git-ai ai dsr query symbol-evolution authenticateUser --limit 50

# View complete semantic snapshot for a commit
git-ai ai dsr context
```

### 4️⃣ Multi-Language Support
### 3️⃣ Multi-Language Support

Supports multiple mainstream programming languages:

@@ -112,18 +100,14 @@ Supports multiple mainstream programming languages:

## 💡 Design Philosophy

git-ai is not just a search tool, but a "semantic timeline" for your codebase:

### DSR (Deterministic Semantic Record)

Each commit corresponds to an immutable semantic snapshot, recording the code structure, symbol relationships, and design intent at that time. Code semantics should be versioned—just like code itself—traceable, comparable, and evolvable.
git-ai is built for deep code understanding through multiple retrieval strategies:

### Hyper RAG
### Advanced RAG

Combines multiple retrieval methods for deeper understanding:
- **Vector Retrieval**: Semantic similarity matching
- **Graph Retrieval**: Call relationship, inheritance analysis
- **DSR Retrieval**: Historical evolution tracing
- **Vector Retrieval**: Semantic similarity matching using SQ8 quantized embeddings
- **Graph Retrieval**: Call relationship and dependency analysis via AST graphs
- **Intelligent Fusion**: Weighted combination of retrieval strategies for optimal results
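
To make the idea of a weighted combination concrete, here is a minimal sketch of fusing vector and graph results. It is only an illustration: the `Hit` shape, the min-max normalization, and the 0.6 / 0.4 weights are assumptions, not the project's actual fusion logic.

```typescript
// Hypothetical sketch, NOT the project's fusion implementation.
interface Hit {
  id: string;    // e.g. a symbol or chunk identifier
  score: number; // raw score from one retriever (higher is better)
}

// Min-max normalize one result list to [0, 1] so that scores from
// different retrievers become comparable.
function normalize(hits: Hit[]): Map<string, number> {
  const out = new Map<string, number>();
  if (hits.length === 0) return out;
  const scores = hits.map((h) => h.score);
  const min = Math.min(...scores);
  const max = Math.max(...scores);
  for (const h of hits) {
    out.set(h.id, max === min ? 1 : (h.score - min) / (max - min));
  }
  return out;
}

// Weighted sum of the normalized scores; an item missing from one
// retriever contributes 0 for that retriever.
function fuse(vectorHits: Hit[], graphHits: Hit[], vectorWeight = 0.6, graphWeight = 0.4): Hit[] {
  const v = normalize(vectorHits);
  const g = normalize(graphHits);
  const ids: string[] = [];
  for (const h of vectorHits.concat(graphHits)) {
    if (ids.indexOf(h.id) === -1) ids.push(h.id);
  }
  return ids
    .map((id) => ({
      id,
      score: vectorWeight * (v.get(id) ?? 0) + graphWeight * (g.get(id) ?? 0),
    }))
    .sort((a, b) => b.score - a.score);
}
```

With this scheme an item that scores moderately in both retrievers can outrank one that scores highly in only one of them.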

### Decentralized Semantics

@@ -161,10 +145,10 @@ git-ai ai graph chain processOrder --max-depth 5
# Find all callers
git-ai ai graph callers deprecatedFunction

# Trace historical changes, understand design intent
git-ai ai dsr query symbol-evolution deprecatedFunction --all
# Analyze complete call chain
git-ai ai graph chain deprecatedFunction --direction upstream
```
*DSR traces historical changes, understanding design intent*
*Graph analysis reveals complete impact scope*
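
Conceptually, an upstream chain query is a depth-limited breadth-first walk over caller edges. The sketch below shows that idea on an in-memory map. The `upstreamChain` function and the caller-index shape are hypothetical, and git-ai itself answers such queries from its graph database rather than with code like this.

```typescript
// Hypothetical sketch, NOT how git-ai implements `graph chain`.
// Maps a callee name to the names of its direct callers.
type CallerIndex = Map<string, string[]>;

interface ChainNode {
  name: string;
  depth: number; // 1 = direct caller, 2 = caller of a caller, ...
}

function upstreamChain(callers: CallerIndex, start: string, maxDepth: number): ChainNode[] {
  const seen = new Set<string>([start]); // guards against cycles
  const result: ChainNode[] = [];
  let frontier: string[] = [start];

  for (let depth = 1; depth <= maxDepth && frontier.length > 0; depth++) {
    const next: string[] = [];
    for (const name of frontier) {
      for (const caller of callers.get(name) ?? []) {
        if (seen.has(caller)) continue;
        seen.add(caller);
        result.push({ name: caller, depth });
        next.push(caller);
      }
    }
    frontier = next;
  }
  return result;
}
```

Everything returned by the walk is code that could be affected when the starting function changes or is removed.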

### Scenario 3: Bug Localization and Root Cause Analysis

@@ -195,30 +179,30 @@ Claude will automatically invoke git-ai tools to provide deep analysis. *Enablin

```mermaid
graph TB
A[Git Repository] -->|On Commit| B[DSR\nDeterministic Semantic Record]
B --> C[.git-ai/dsr/commit.json\nSemantic Snapshot]
C -->|Index Rebuild| D[LanceDB\nVector Database]
C -->|Index Rebuild| E[CozoDB\nGraph Database]
D --> F[MCP Server]
E --> F
F -->|Tool Call| G[AI Agent\nClaude Desktop / Trae]
F -->|CLI| H[Developer]
C -->|Cross-Version| I[Semantic Timeline\nTraceable · Comparable · Evolvable]
A[Git Repository] -->|Index| B[Code Parser\nMulti-Language AST]
B --> C[LanceDB\nVector Database]
B --> D[CozoDB\nGraph Database]
C --> E[MCP Server]
D --> E
E -->|Tool Call| F[AI Agent\nClaude Desktop / Cursor]
E -->|CLI| G[Developer]
B -->|Repo Map| H[PageRank Analysis\nImportance Scoring]
H --> E
style B fill:#e1f5ff,stroke:#333
style C fill:#e8f5e9,stroke:#333
style C fill:#fff4e1,stroke:#333
style D fill:#fff4e1,stroke:#333
style E fill:#fff4e1,stroke:#333
style F fill:#e8f5e9,stroke:#333
style G fill:#f3e5f5,stroke:#333
style I fill:#fce4ec,stroke:#333
style E fill:#e8f5e9,stroke:#333
style F fill:#f3e5f5,stroke:#333
style H fill:#fce4ec,stroke:#333
```

**Core Components**:

- **DSR (Deterministic Semantic Record)**: Immutable semantic snapshots stored per commit, versioned semantics
- **LanceDB + SQ8**: High-performance vector database, supporting semantic search
- **CozoDB**: Graph database, supporting AST-level relationship queries
- **MCP Server**: Standard protocol interface, for AI Agent invocation
- **Code Parser**: Multi-language AST extraction (TypeScript, Java, Python, Go, Rust, C, Markdown, YAML)
- **LanceDB + SQ8**: High-performance vector database with quantized embeddings for semantic search
- **CozoDB**: Graph database for AST-level relationship queries (callers, callees, chains)
- **Repo Map**: PageRank-based code importance analysis for project overview
- **MCP Server**: Standard protocol interface for AI Agent invocation
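
For readers unfamiliar with the term, SQ8 generally refers to 8-bit scalar quantization: each float component of an embedding is mapped to a single byte. The sketch below shows one common variant that uses a per-vector minimum and scale. Whether `src/core/sq8.ts` uses per-vector or per-dimension ranges is not stated in this diff, so treat the function names and the scheme as assumptions.

```typescript
// Hypothetical sketch of 8-bit scalar quantization, NOT the code in sq8.ts.
interface Sq8Vector {
  codes: Uint8Array; // one byte per dimension
  min: number;       // value represented by code 0
  scale: number;     // step between adjacent codes
}

function quantizeSq8(vector: number[]): Sq8Vector {
  if (vector.length === 0) return { codes: new Uint8Array(0), min: 0, scale: 1 };
  let min = vector[0]!;
  let max = vector[0]!;
  for (const x of vector) {
    if (x < min) min = x;
    if (x > max) max = x;
  }
  // A constant vector has no range; any non-zero scale works then.
  const scale = max > min ? (max - min) / 255 : 1;
  const codes = new Uint8Array(vector.length);
  for (let i = 0; i < vector.length; i++) {
    codes[i] = Math.round((vector[i]! - min) / scale);
  }
  return { codes, min, scale };
}

function dequantizeSq8(q: Sq8Vector): number[] {
  const out: number[] = [];
  for (let i = 0; i < q.codes.length; i++) {
    out.push(q.min + q.codes[i]! * q.scale);
  }
  return out;
}
```

Storage drops from four or eight bytes per dimension to one, at the cost of a rounding error of at most half a quantization step per component.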

---

@@ -228,12 +212,12 @@ graph TB
|---------|--------|-------------------|-------------|
| Local Execution | ✅ | ❌ | ❌ |
| AST-Level Analysis | ✅ | ❌ | ✅ |
| Versioned Semantics | ✅ | ❌ | ❌ |
| Historical Change Tracing | ✅ | ❌ | ❌ |
| AI Agent Integration | ✅ | ❌ | ❌ |
| Free & Open Source | ✅ | ❌ | ❌ |
| Semantic Search | ✅ | ✅ | ✅ |
| Call Chain Analysis | ✅ | ❌ | ✅ |
| Multi-Language Support | ✅ | ✅ | ✅ |
| Repo Map with PageRank | ✅ | ❌ | ❌ |

---

4 changes: 1 addition & 3 deletions skills/git-ai-code-search/SKILL.md
@@ -1,7 +1,7 @@
---
name: git-ai-code-search
description: |
Semantic code search and codebase understanding using git-ai MCP tools. Use when: (1) Searching for symbols, functions, or semantic concepts, (2) Understanding project architecture, (3) Analyzing call graphs and code relationships, (4) Tracking symbol history via DSR. Triggers: "find X", "search for X", "who calls X", "where is X", "history of X", "understand this codebase".
Semantic code search and codebase understanding using git-ai MCP tools. Use when: (1) Searching for symbols, functions, or semantic concepts, (2) Understanding project architecture, (3) Analyzing call graphs and code relationships. Triggers: "find X", "search for X", "who calls X", "where is X", "understand this codebase".
---

# git-ai Code Search
@@ -33,15 +33,13 @@ git-ai ai semantic "authentication logic" # search
| Who calls X | `ast_graph_callers` | `{ path, name: "processOrder" }` |
| What X calls | `ast_graph_callees` | `{ path, name: "processOrder" }` |
| Call chain | `ast_graph_chain` | `{ path, name: "main", direction: "downstream" }` |
| Symbol history | `dsr_symbol_evolution` | `{ path, symbol: "UserService" }` |
| Project overview | `repo_map` | `{ path, max_files: 20 }` |

## Rules

1. **Always pass `path`** - Every tool requires explicit repository path
2. **Check index first** - Run `check_index` before search tools
3. **Read before modify** - Use `read_file` to understand code before changes
4. **Use DSR for history** - Never parse git log manually
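
The first two rules can be enforced mechanically by wrapping tool calls. The sketch below is one way to do that, under loud assumptions: `CallTool` stands in for whatever MCP client function is actually available, the `ok` field on the `check_index` result is a guess at its shape, and the gated list contains only the graph tools that the constraints table marks as requiring a passed `check_index`.

```typescript
// Hypothetical wrapper, NOT part of git-ai. The client function type and
// the shape of the check_index result are assumptions for illustration.
type ToolArgs = { path: string; [key: string]: unknown };
type CallTool = (name: string, args: ToolArgs) => Promise<unknown>;

// Tools listed in constraints.md with the prerequisite "check_index passed".
const INDEX_GATED: string[] = ["ast_graph_callers", "ast_graph_callees", "ast_graph_chain"];

async function guardedCall(callTool: CallTool, name: string, args: ToolArgs): Promise<unknown> {
  // Rule 1: every tool call needs an explicit repository path.
  if (!args.path) {
    throw new Error(`${name}: explicit "path" argument is required`);
  }
  // Rule 2: gated tools run only after check_index succeeds.
  if (INDEX_GATED.indexOf(name) !== -1) {
    // Assumption: the result is an object with a boolean `ok` field.
    const status = (await callTool("check_index", { path: args.path })) as { ok?: boolean } | null;
    if (!status || status.ok !== true) {
      throw new Error(`${name}: index check failed for ${args.path}; rebuild the index first`);
    }
  }
  return callTool(name, args);
}
```

A call such as `guardedCall(callTool, "ast_graph_callers", { path: "/repo", name: "processOrder" })` would then issue `check_index` first and only reach the graph tool if the check passes.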

## References

Expand Down
31 changes: 0 additions & 31 deletions skills/git-ai-code-search/references/constraints.md
@@ -50,12 +50,6 @@ search_symbols({ path: "/repo", query: "..." })

**Why:** Prevents breaking changes and ensures informed modifications.

### 4. use_dsr_for_history

**Rule:** When tracing symbol history, MUST use `dsr_symbol_evolution`. NEVER manually parse git log or diff.

**Why:** DSR provides structured, semantic change information that raw git commands don't.

## Warning-Level Rules

### 5. repo_map_before_large_change
@@ -64,15 +58,6 @@ search_symbols({ path: "/repo", query: "..." })

**Why:** Provides context for planning changes and identifying affected areas.

### 6. respect_dsr_risk

**Rule:** When DSR reports `risk_level: high`, exercise extra caution. Operations like `delete` and `rename` require additional review.

**Risk levels:**
- `low`: Safe, routine changes
- `medium`: Review recommended
- `high`: Extra scrutiny required

## Recommended Practices

### prefer_semantic_search
@@ -103,26 +88,13 @@ ast_graph_callers({ path: "/repo", name: result.name })
// etc.
```

### incremental_dsr_generation

Generate DSR on-demand for specific commits rather than batch-generating for entire history.

```js
// Good: Generate for specific commit when needed
dsr_generate({ path: "/repo", commit: "abc123" })

// Avoid: Generating for all historical commits upfront
```

## Prohibited Actions

| Action | Reason |
|--------|--------|
| Assume symbol location without searching | Always confirm via search |
| Modify unread files | Must read and understand first |
| Manual git log parsing for history | Use DSR tools instead |
| Search with missing index | Rebuild index first |
| Ignore high risk DSR warnings | Requires extra review |
| Omit `path` parameter | Every call must be explicit |

## Tool-Specific Constraints
@@ -137,7 +109,4 @@ dsr_generate({ path: "/repo", commit: "abc123" })
| `ast_graph_callers` | `check_index` passed | `path`, `name` |
| `ast_graph_callees` | `check_index` passed | `path`, `name` |
| `ast_graph_chain` | `check_index` passed | `path`, `name` |
| `dsr_context` | None | `path` |
| `dsr_generate` | None | `path`, `commit` |
| `dsr_symbol_evolution` | DSR exists for commits | `path`, `symbol` |
| `read_file` | None | `path`, `file` |
55 changes: 0 additions & 55 deletions skills/git-ai-code-search/references/tools.md
@@ -178,61 +178,6 @@ ast_graph_refs({
})
```

## DSR Tools (Deterministic Semantic Records)

### dsr_context

Get repository Git context and DSR directory state.

```js
dsr_context({ path: "/repo" })
```

**Returns:** Branch info, commit status, DSR availability.

### dsr_generate

Generate DSR for a specific commit.

```js
dsr_generate({
path: "/repo",
commit: "HEAD" // or specific commit hash
})
```

**When to use:** Before querying history for commits without DSR.

### dsr_symbol_evolution

Track how a symbol changed over time.

```js
dsr_symbol_evolution({
path: "/repo",
symbol: "authenticateUser",
limit: 50,
contains: false, // true for substring match
all: false // true to traverse all refs, not just HEAD
})
```

**Returns:** List of changes with:
- `commit`: Commit hash
- `operation`: add | modify | delete | rename
- `risk_level`: low | medium | high
- `details`: Change description

**When to use:** Understanding design evolution, finding when/why something changed.

### dsr_rebuild_index

Rebuild DSR index from DSR files.

```js
dsr_rebuild_index({ path: "/repo" })
```

## File Operations

### read_file