You're tired of AI agents writing code that "just works" but fits like a square peg in a round hole: not your conventions, not your architecture, not your repo, even with well-curated instructions. You correct the agent, but it doesn't remember. Next session, same mistakes.
This MCP gives agents just enough context so they match how your team codes, know why, and remember every correction.
Here's what codebase-context does:
Finds the right context - Search that doesn't just return code. Each result comes back with analyzed and quantified coding patterns and conventions, related team memories, file relationships, and quality indicators. It knows whether you're looking for a specific file, a concept, or how things wire together - and filters out the noise (test files, configs, old utilities) before the agent sees them. The agent gets curated context, not raw hits.
Knows your conventions - Detected from your code and git history, not only from rules you wrote. Measures team consensus and direction through adoption percentages, trends (rising/declining), and golden files. Tells the difference between code that's common and code that's current - what patterns the team is moving toward and what's being left behind.
Remembers across sessions - Decisions, failures, workarounds that look wrong but exist for a reason - the battle scars that aren't in the comments. Recorded once, surfaced automatically so the agent doesn't "clean up" something you spent a week getting right. Conventional git commits (refactor:, migrate:, fix:) auto-extract into memory with zero effort. Stale memories decay and get flagged instead of blindly trusted.
Checks before editing - Before editing something, you get a decision card showing whether there's enough evidence to proceed. If a symbol has four callers and only two appear in your search results, the card shows that coverage gap. If coverage is low, whatWouldHelp lists the specific searches to run before you touch anything. When code, team memories, and patterns contradict each other, it tells you to look deeper instead of guessing.
One tool call returns all of it. Local-first - your code never leaves your machine.
Add it to your AI agent's configuration.

Claude Code:

```shell
claude mcp add codebase-context -- npx -y codebase-context /path/to/your/project
```

Add to `claude_desktop_config.json`:

```json
{
  "mcpServers": {
    "codebase-context": {
      "command": "npx",
      "args": ["-y", "codebase-context", "/path/to/your/project"]
    }
  }
}
```

Add `.vscode/mcp.json` to your project root:
```json
{
  "servers": {
    "codebase-context": {
      "command": "npx",
      "args": ["-y", "codebase-context", "/path/to/your/project"] // or "${workspaceFolder}" if your workspace is one project only
    }
  }
}
```

Add to `.cursor/mcp.json` in your project:
```json
{
  "mcpServers": {
    "codebase-context": {
      "command": "npx",
      "args": ["-y", "codebase-context", "/path/to/your/project"]
    }
  }
}
```

Open Settings > MCP and add:
```json
{
  "mcpServers": {
    "codebase-context": {
      "command": "npx",
      "args": ["-y", "codebase-context", "/path/to/your/project"]
    }
  }
}
```

For Codex, run:

```shell
codex mcp add codebase-context npx -y codebase-context "/path/to/your/project"
```
Other tools help AI find code. This one helps AI make the right decisions - by knowing what your team does, tracking where codebases are heading, and warning before mistakes happen.
| Without codebase-context | With codebase-context |
|---|---|
| Generates code using whatever matches or "sounds" right | Generates code following your team conventions |
| Copies any example that fits | Follows your best implementations (golden files) |
| Repeats mistakes you already corrected | Surfaces failure memories right before trying again |
| You re-explain the same things every session | Remembers conventions and decisions automatically |
| Edits confidently even when context is weak | Flags high-risk changes when evidence is thin |
| Sees what the current code does and assumes | Sees how your code has evolved and why |
This is where it all comes together. One call returns:

- Code results with `file` (path + line range), `summary`, `score`
- Type per result: compact `componentType:layer` (e.g., `service:data`) — helps agents orient
- Pattern signals per result: `trend` (Rising/Declining — Stable is omitted) and `patternWarning` when using legacy code
- Relationships per result: `importedByCount` and `hasTests` (condensed) + hints (capped ranked callers, consumers, tests) — so you see suggested next reads and know what you haven't looked at yet
- Related memories: up to 3 team decisions, gotchas, and failures matched to the query
- Search quality: `ok` or `low_confidence` with confidence score and `hint` when low
- Preflight: `ready` (boolean) with decision card when `intent="edit"|"refactor"|"migrate"`. Shows `nextAction` (if not ready), `warnings`, `patterns` (do/avoid), `bestExample`, `impact` (caller coverage), and `whatWouldHelp` (next steps). If search quality is low, `ready` is always `false`.
Snippets are optional (`includeSnippets: true`). When enabled, snippets that have symbol metadata (e.g. from the Generic analyzer's AST chunking or Angular component chunks) start with a scope header so you know where the code lives (e.g. `// AuthService.getToken()` or `// SpotifyApiService`). Example:
```typescript
// AuthService.getToken()
getToken(): string {
  return this.token;
}
```

Default output is lean — if the agent wants code, it calls `read_file`.
```json
{
  "searchQuality": { "status": "ok", "confidence": 0.72 },
  "preflight": {
    "ready": false,
    "nextAction": "2 of 5 callers aren't in results — search for src/app.module.ts",
    "patterns": {
      "do": ["HttpInterceptorFn — 97%", "standalone components — 84%"],
      "avoid": ["constructor injection — 3% (declining)"]
    },
    "bestExample": "src/auth/auth.interceptor.ts",
    "impact": {
      "coverage": "3/5 callers in results",
      "files": ["src/app.module.ts", "src/boot.ts"]
    },
    "whatWouldHelp": [
      "Search for src/app.module.ts to cover the main caller",
      "Call get_team_patterns for auth/ injection patterns"
    ]
  },
  "results": [
    {
      "file": "src/auth/auth.interceptor.ts:1-20",
      "summary": "HTTP interceptor that attaches auth token to outgoing requests",
      "score": 0.72,
      "type": "service:core",
      "trend": "Rising",
      "relationships": { "importedByCount": 4, "hasTests": true },
      "hints": {
        "callers": ["src/app.module.ts", "src/boot.ts"],
        "tests": ["src/auth/auth.interceptor.spec.ts"]
      }
    }
  ],
  "relatedMemories": ["Always use HttpInterceptorFn (0.97)"]
}
```

Lean enough to fit on one screen. If search quality is low, preflight blocks edits instead of faking confidence.
Detects what your team actually does by analyzing the codebase:
- Adoption percentages for dependency injection, state management, testing, libraries
- Patterns/conventions trend direction (Rising / Stable / Declining) based on git recency
- Golden files - your best implementations ranked by modern pattern density
- Conflicts - when the team hasn't converged (both approaches above 20% adoption)
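The conflict rule above can be sketched as a small check. This is illustrative only — the interface and function names are hypothetical, not the tool's actual API; the only thing taken from the text is the rule that a category is conflicted when two or more approaches each clear 20% adoption.

```typescript
// Hypothetical shape for one detected approach within a pattern category.
interface PatternAdoption {
  name: string;
  adoption: number; // fraction of files using this approach, 0..1
}

// Returns the competing approaches when the team has not converged:
// two or more approaches above the adoption threshold (20% per the docs).
function findConflicts(
  approaches: PatternAdoption[],
  threshold = 0.2
): PatternAdoption[] {
  const contenders = approaches.filter((a) => a.adoption > threshold);
  // A single dominant approach is consensus, not a conflict.
  return contenders.length >= 2 ? contenders : [];
}

// Example: two DI styles both in heavy use — a conflict worth flagging.
const diStyles: PatternAdoption[] = [
  { name: "inject() function", adoption: 0.55 },
  { name: "constructor injection", adoption: 0.38 },
  { name: "service locator", adoption: 0.04 },
];
const conflicts = findConflicts(diStyles);
```

Here `conflicts` contains the two contenders above 20%, while the 4% outlier is ignored.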
Record a decision once. It surfaces automatically in search results and preflight cards from then on. Your git commits also become memories: conventional commits like `refactor:`, `migrate:`, `fix:`, and `revert:` from the last 90 days are auto-extracted during indexing.
- Types: conventions (style rules), decisions (architecture choices), gotchas (things that break), failures (we tried X, it broke because Y)
- Confidence decay: decisions age over 180 days, gotchas and failures over 90 days. Stale memories get flagged instead of blindly trusted.
- Zero-config git extraction: runs automatically during `refresh_index`. No setup, no manual work.
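As a rough sketch of how such decay could work: the windows (180 days for decisions, 90 for gotchas and failures) come from the list above, but the linear curve and all names here are assumptions for illustration, not the tool's implementation.

```typescript
type MemoryType = "decision" | "gotcha" | "failure";

// Decay windows from the docs; the curve shape is an assumption (linear).
const DECAY_WINDOW_DAYS: Record<MemoryType, number> = {
  decision: 180,
  gotcha: 90,
  failure: 90,
};

// Confidence falls linearly with age and bottoms out at zero, at which
// point a memory would be flagged as stale rather than trusted.
function decayedConfidence(
  base: number,
  ageDays: number,
  type: MemoryType
): number {
  const windowDays = DECAY_WINDOW_DAYS[type];
  return base * Math.max(0, 1 - ageDays / windowDays);
}

// A 90-day-old decision keeps half its confidence under this model,
// while a 90-day-old gotcha has fully decayed.
const decisionAt90 = decayedConfidence(1.0, 90, "decision"); // 0.5
const gotchaAt90 = decayedConfidence(1.0, 90, "gotcha"); // 0
```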
| Tool | What it does |
|---|---|
| `search_codebase` | Hybrid search + decision card. Pass `intent="edit"` to get `ready`, `nextAction`, `patterns`, caller coverage, and `whatWouldHelp`. |
| `get_team_patterns` | Pattern frequencies, golden files, conflict detection |
| `get_symbol_references` | Find concrete references to a symbol (`usageCount` + top snippets). `confidence: "syntactic"` = static/source-based only; no runtime or dynamic dispatch. |
| `remember` | Record a convention, decision, gotcha, or failure |
| `get_memory` | Query team memory with confidence-decay scoring |
| `get_codebase_metadata` | Project structure, frameworks, dependencies |
| `get_style_guide` | Style guide rules for the current project |
| `detect_circular_dependencies` | Import cycles between files |
| `refresh_index` | Re-index (full or incremental) + extract git memories |
| `get_indexing_status` | Progress and stats for the current index |
Reproducible evaluation with frozen fixtures, so ranking and chunking changes are measured honestly. For contributors and CI: run before releases or after changing search, ranking, or chunking to guard against regressions.
- Two codebases: `npm run eval -- <codebaseA> <codebaseB>`
- Defaults: fixture A = `tests/fixtures/eval-angular-spotify.json`, fixture B = `tests/fixtures/eval-controlled.json`
- Offline smoke (no network):

```shell
npm run eval -- tests/fixtures/codebases/eval-controlled tests/fixtures/codebases/eval-controlled \
  --fixture-a=tests/fixtures/eval-controlled.json \
  --fixture-b=tests/fixtures/eval-controlled.json \
  --skip-reindex --no-rerank
```

- Flags: `--help`, `--fixture-a`, `--fixture-b`, `--skip-reindex`, `--no-rerank`, `--no-redact`
- To save a report for later comparison, redirect stdout (e.g. `pnpm run eval -- <path-to-angular-spotify> --skip-reindex > internal-docs/tests/eval-runs/angular-spotify-YYYY-MM-DD.txt`).
The retrieval pipeline is designed around one goal: give the agent the right context, not just any file that matches.
- Definition-first ranking - for exact-name lookups (e.g. a symbol name), the file that defines the symbol ranks above files that only use it.
- Intent classification - knows whether "AuthService" is a name lookup or "how does auth work" is conceptual. Adjusts keyword/semantic weights accordingly.
- Hybrid fusion (RRF) - combines keyword and semantic search using Reciprocal Rank Fusion instead of brittle score averaging.
- Query expansion - conceptual queries automatically expand with domain-relevant terms (auth → login, token, session, guard).
- Contamination control - test files are filtered/demoted for non-test queries.
- Import centrality - files that are imported more often rank higher.
- Cross-encoder reranking - a stage-2 reranker triggers only when top scores are ambiguous. CPU-only, bounded to top-K.
- Incremental indexing - only re-indexes files that changed since last run (SHA-256 manifest diffing).
- Version gating - index artifacts are versioned; mismatches trigger automatic rebuild so mixed-version data is never served.
- Auto-heal - if the index corrupts, search triggers a full re-index automatically.
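Of the stages above, Reciprocal Rank Fusion is simple enough to sketch in a few lines. This is the standard RRF formula (score = sum of `1 / (k + rank)` across rankings), not this tool's internal code; `k = 60` is the conventional constant from the RRF literature, and the tool's value may differ.

```typescript
// Fuse any number of ranked document lists with Reciprocal Rank Fusion.
// RRF rewards documents that appear high in multiple rankings without
// having to compare raw keyword and embedding scores directly.
function rrfFuse(rankings: string[][], k = 60): Map<string, number> {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((doc, i) => {
      const rank = i + 1; // 1-based rank within this ranking
      scores.set(doc, (scores.get(doc) ?? 0) + 1 / (k + rank));
    });
  }
  return scores;
}

// Hypothetical example: a file ranked well by BOTH retrievers beats a
// file ranked highly by only one.
const keyword = ["auth.interceptor.ts", "token.ts", "auth.service.ts"];
const semantic = ["auth.service.ts", "auth.interceptor.ts", "login.ts"];
const fused = [...rrfFuse([keyword, semantic]).entries()].sort(
  (a, b) => b[1] - a[1]
);
```

`auth.interceptor.ts` (ranked 1st and 2nd) ends up on top, ahead of files that only one retriever liked.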
Index reliability: Rebuilds write to a staging directory and swap atomically only on success, so a failed rebuild never corrupts the active index. Version mismatches or corruption trigger an automatic full re-index (no user action required).
10 languages have full symbol extraction (Tree-sitter): TypeScript, JavaScript, Python, Java, Kotlin, C, C++, C#, Go, Rust. 30+ languages have indexing and retrieval coverage (keyword + semantic), including PHP, Ruby, Swift, Scala, Shell, and config/markup (JSON/YAML/TOML/XML, etc.).
Enrichment is framework-specific: right now only Angular has a dedicated analyzer for rich conventions/context (signals, standalone components, control flow, DI patterns).
For non-Angular projects, the Generic analyzer uses AST-aligned chunking when a Tree-sitter grammar is available: symbol-bounded chunks with scope-aware prefixes (e.g. // ClassName.methodName) so snippets show where code lives. Without a grammar it falls back to safe line-based chunking.
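The scope-aware prefix can be sketched like this. The metadata shape is hypothetical — only the `// ClassName.methodName()` / `// ClassName` output format comes from the text above.

```typescript
// Hypothetical symbol metadata attached to an AST-aligned chunk.
interface ChunkSymbol {
  container?: string; // enclosing class, if the chunk is a member
  name: string;       // function, method, or class name
}

// Build the scope header shown at the top of a snippet, e.g.
// "// AuthService.getToken()" for a method or "// SpotifyApiService"
// for a top-level symbol.
function scopeHeader(sym: ChunkSymbol): string {
  return sym.container
    ? `// ${sym.container}.${sym.name}()`
    : `// ${sym.name}`;
}
```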
Structured filters available: `framework`, `language`, `componentType`, `layer` (presentation, business, data, state, core, shared).
| Variable | Default | Description |
|---|---|---|
| `EMBEDDING_PROVIDER` | `transformers` | `openai` (fast, cloud) or `transformers` (local, private) |
| `OPENAI_API_KEY` | - | Required only if using the `openai` provider |
| `CODEBASE_ROOT` | - | Project root (CLI arg takes precedence) |
| `CODEBASE_CONTEXT_DEBUG` | - | Set to `1` for verbose logging |
- First indexing: 2-5 minutes for ~30k files (embedding computation).
- Subsequent queries: milliseconds from cache.
- Incremental updates: `refresh_index` with `incrementalOnly: true` processes only changed files (SHA-256 manifest diffing).
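Manifest diffing of this kind can be sketched as follows — the manifest format and function names are hypothetical, not the tool's internal structure; only the idea (hash every file, re-index the ones whose SHA-256 changed) comes from the docs.

```typescript
import { createHash } from "node:crypto";

// Hypothetical manifest: file path -> SHA-256 hex digest of its contents.
type Manifest = Record<string, string>;

function sha256(contents: string): string {
  return createHash("sha256").update(contents).digest("hex");
}

// Compare current file contents against the previous manifest and return
// only the paths that are new or whose hash changed — the files that
// actually need re-chunking and re-embedding.
function filesToReindex(
  previous: Manifest,
  current: Record<string, string> // path -> current file contents
): string[] {
  return Object.entries(current)
    .filter(([path, contents]) => previous[path] !== sha256(contents))
    .map(([path]) => path);
}

// Example: a.ts is unchanged, b.ts is new, so only b.ts is re-indexed.
const previous: Manifest = { "a.ts": sha256("export const a = 1;") };
const changed = filesToReindex(previous, {
  "a.ts": "export const a = 1;",
  "b.ts": "export const b = 2;",
});
```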
```
.codebase-context/
  memory.json          # Team knowledge (should be persisted in git)
  index-meta.json      # Index metadata and version (generated)
  intelligence.json    # Pattern analysis (generated)
  relationships.json   # File/symbol relationships (generated)
  index.json           # Keyword index (generated)
  index/               # Vector database (generated)
```

Recommended `.gitignore`:

```
# Codebase Context - ignore generated files, keep memory
.codebase-context/*
!.codebase-context/memory.json
```

All MCP tools are available as CLI commands — no AI agent required. Useful for scripting, debugging, and CI workflows.
Set `CODEBASE_ROOT` to your project root, or run from the project directory.
```shell
# Search the indexed codebase
npx codebase-context search --query "authentication middleware"
npx codebase-context search --query "auth" --intent edit --limit 5

# Project structure, frameworks, and dependencies
npx codebase-context metadata

# Index state and progress
npx codebase-context status

# Re-index the codebase
npx codebase-context reindex
npx codebase-context reindex --incremental --reason "added new service"

# Style guide rules
npx codebase-context style-guide
npx codebase-context style-guide --query "naming" --category patterns

# Team patterns (DI, state, testing, etc.)
npx codebase-context patterns
npx codebase-context patterns --category testing

# Symbol references
npx codebase-context refs --symbol "UserService"
npx codebase-context refs --symbol "handleLogin" --limit 20

# Circular dependency detection
npx codebase-context cycles
npx codebase-context cycles --scope src/features

# Memory management
npx codebase-context memory list
npx codebase-context memory list --category conventions --type convention
npx codebase-context memory list --query "auth" --json
npx codebase-context memory add --type convention --category tooling --memory "Use pnpm, not npm" --reason "Workspace support and speed"
npx codebase-context memory remove <id>
```

All commands accept `--json` for raw JSON output suitable for piping and scripting.
Add this to `.cursorrules`, `CLAUDE.md`, or `AGENTS.md`:
```markdown
## Codebase Context

**At start of each task:** Call `get_memory` to load team conventions.

**When user says "remember this" or "record this":**
- Call `remember` tool IMMEDIATELY before doing anything else.
```
- Motivation - Research and design rationale
- Changelog - Version history
- Contributing - How to add analyzers
MIT