Commit 7cf4de5

Built a mature search and retrieval solution. Enabled tool-based usage via an MCP server. Introduced AI models: Devstral 2 and Gemini for natural-language interactions, VoyageAI for embedding and reranking.

1 parent 374cfb5 commit 7cf4de5

25 files changed

Lines changed: 2621 additions & 93 deletions

.gitignore

Lines changed: 3 additions & 1 deletion

````diff
@@ -208,5 +208,7 @@ __marimo__/
 knowcode_knowledge.json
 CHANGELOG.md
 docs_test/
-KnowCode.md
+# aimodels.yaml
+# knowcode.yaml
+# .knowcode/
 
````
KnowCode.md

Lines changed: 161 additions & 2 deletions

````diff
@@ -178,7 +178,8 @@ Enable **retrieval-augmented generation (RAG)** by indexing code semantics in a
 * **Embedding**: Generate dense vector representations (e.g., OpenAI text-embedding-3-small)
 * **Vector Storage**: Persist vectors for fast nearest-neighbor search
 * **Hybrid Retrieval**: Combine dense (vector) and sparse (BM25) search results
-* **Reranking**: Optimize results based on metadata, recency, and completeness
+* **Reranking**: Upgrade to a **cross-encoder** (e.g., ms-marco-MiniLM) for high-precision relevance scoring versus simple cosine similarity
+* **Graph-Enhanced Query Expansion**: Use the semantic graph to expand search terms (e.g., synonyms, child classes, interfaces)
 * **[HARDENED]** Sliding window chunking with overlap
 * **[HARDENED]** Real-time incremental indexing (Watch Mode)
 * **[HARDENED]** Dependency-aware result expansion (Completeness)
````
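The hybrid-retrieval bullet above merges a dense (vector) ranking with a sparse (BM25) ranking; one common way to do that is reciprocal rank fusion. A minimal sketch, not KnowCode's actual code — the entity names and the conventional `k=60` constant are illustrative:

```python
def reciprocal_rank_fusion(dense_ranked, sparse_ranked, k=60):
    """Merge two ranked lists of entity ids via reciprocal rank fusion (RRF)."""
    scores = {}
    for ranking in (dense_ranked, sparse_ranked):
        for rank, entity_id in enumerate(ranking, start=1):
            # 1/(k + rank) rewards items that rank highly in either list
            scores[entity_id] = scores.get(entity_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical result lists from the dense (vector) and sparse (BM25) searches
dense = ["graph.build", "store.save", "cli.ask"]
sparse = ["store.save", "parser.scan", "graph.build"]
fused = reciprocal_rank_fusion(dense, sparse)
print(fused[0])
```

Items that appear near the top of both lists float to the front, which is the behavior the hybrid step relies on before reranking.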
````diff
@@ -548,7 +549,112 @@ Use frontier LLMs **only where they add leverage**, not as a crutch.
 
 ---
 
-## **11\. Feedback, Validation & Evolution Layer**
+---
+
+## **10a. [NEW] Agent & Configuration Layer**
+
+### **Purpose**
+
+Provide a robust, configurable interface to external LLMs with failover, rate limiting, and multi-provider support.
+
+### **Responsibilities**
+
+* **Configuration**: Load model priorities and settings from `aimodels.yaml` or `knowcode.yaml`.
+* **Model Selection**: Iterate through prioritized models.
+* **Failover**: Automatically retry with the next model on `429 ResourceExhausted` errors.
+* **Rate Limiting (New)**: Persistently track RPM (requests per minute) and RPD (requests per day) usage locally in `~/.knowcode/usage_stats.json` to avoid API bans.
+* **Multi-Provider Support**:
+  * **Google Gemini**: Native `google.genai` client.
+  * **OpenAI/OpenRouter**: Generic `openai` client support (e.g., Mistral via OpenRouter).
+* **Reasoning Loop (ReAct)**: Dynamically call tools (`list_files`, `find_references`, `search_history`) to disambiguate queries or explore before answering.
+* **Temporal Integration**: Query `TemporalAnalyzer` to answer "why" and "when" questions based on git history.
+* **Structured Output**: Support JSON/YAML schemas for automation tasks.
+* **Task-Aware Context**: Dynamically adjust context prioritization (debug vs. explain) based on user intent.
+
+### **Inputs**
+
+* `aimodels.yaml` configuration
+* User query
+* Retrieved context bundle
+
+### **Outputs**
+
+* LLM answer
+* Updated usage statistics
+
+### **Downstream Consumers**
+
+* `knowcode ask` command
+* External IDE agents via MCP (Layer 10b)
+
+---
````
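The failover rule in Layer 10a (retry with the next model on `429 ResourceExhausted`) amounts to a loop over the prioritized model list. A hedged sketch: the exception class and the `(name, call)` pairs below are stand-ins, not the real `google.genai` or `openai` clients.

```python
class ResourceExhausted(Exception):
    """Stand-in for a provider's 429 quota error."""

def ask_with_failover(prompt, models):
    """Try each configured model in priority order, falling through on 429.

    `models` holds (name, call) pairs; each `call` stands in for a real
    provider client.
    """
    errors = {}
    for name, call in models:
        try:
            return name, call(prompt)
        except ResourceExhausted as exc:
            errors[name] = exc  # quota hit: try the next model
    raise RuntimeError(f"all models exhausted: {list(errors)}")

# Hypothetical clients: the first is over quota, the second succeeds
def over_quota(prompt):
    raise ResourceExhausted("429")

result = ask_with_failover(
    "Explain the indexing layer",
    [("gemini-2.0-flash-lite", over_quota), ("gemini-1.5-flash", lambda p: "answer")],
)
print(result)
```

Collecting the per-model errors before raising keeps the final failure message actionable when every model in the config is exhausted.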
````diff
+
+## **10b. [NEW] Tool Exposure Layer (MCP)**
+
+### **Purpose**
+
+Expose KnowCode's intelligence capabilities as **callable tools** for external AI agents (e.g., IDE-integrated agents like Google's Antigravity) via the Model Context Protocol (MCP).
+
+### **Responsibilities**
+
+* **MCP Server**: Run a compliant MCP server discoverable by IDE agents.
+* **Tool Registration**: Expose structured tools aligned with Layer 8 query types.
+* **Sufficiency Scoring**: Return confidence metrics so agents can decide whether to use external LLMs.
+* **Structured Responses**: JSON schemas for programmatic consumption.
+* **[HARDENED]** Tool versioning for backward compatibility.
+* **[HARDENED]** Per-tool rate limiting for resource protection.
+* **[HARDENED]** Telemetry for tool usage analytics.
+
+### **Exposed Tools**
+
+```yaml
+Tools:
+  - name: search_codebase
+    description: "Semantic + lexical search for code entities"
+    parameters: { query: string, limit: int }
+    returns: List of {entity_id, name, kind, file, snippet, score}
+
+  - name: get_entity_context
+    description: "Token-budgeted context bundle with sufficiency score"
+    parameters: { entity_id: string, max_tokens: int, task_type: debug|refactor|extend|review }
+    returns: {context_text, included_entities, sufficiency_score, token_count}
+
+  - name: trace_calls
+    description: "Multi-hop call graph traversal"
+    parameters: { entity_id: string, direction: callers|callees, depth: int }
+    returns: List of {entity, call_depth, file, line}
+
+  - name: get_impact
+    description: "Deletion impact analysis"
+    parameters: { entity_id: string }
+    returns: {direct_dependents, transitive_dependents, risk_score}
+
+  - name: explain_flow
+    description: "Step-by-step execution trace"
+    parameters: { entry_point: string, max_depth: int }
+    returns: {steps: [{entity, description, code_snippet}]}
+```
+
+### **Inputs**
+
+* MCP tool invocation from external agent
+* Tool parameters
+
+### **Outputs**
+
+* Structured JSON responses
+* Sufficiency scores for context adequacy
+* Token estimates for budget planning
+
+### **Downstream Consumers**
+
+* External IDE agents (Antigravity, Cursor, etc.)
+* CI/CD pipelines
+* Automation scripts
+
+---
+
+## **11. Feedback, Validation & Evolution Layer**
 
 ### **Purpose**
````
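The persistent RPM/RPD tracking described in Layer 10a (and the per-tool rate limiting hardened into Layer 10b) might look roughly like this. A minimal sketch only: KnowCode stores stats in `~/.knowcode/usage_stats.json`, but the path is injected here so the example stays self-contained, and the file format (a JSON list of timestamps) is an assumption.

```python
import json
import os
import tempfile
import time
from pathlib import Path

class UsageTracker:
    """Persist request timestamps and enforce RPM/RPD ceilings."""

    def __init__(self, path, rpm_limit, rpd_limit):
        self.path = Path(path)
        self.rpm_limit = rpm_limit
        self.rpd_limit = rpd_limit

    def _load(self):
        if self.path.exists():
            return json.loads(self.path.read_text())
        return []

    def allow(self, now=None):
        """Record and permit a request unless a limit would be exceeded."""
        now = time.time() if now is None else now
        stamps = [t for t in self._load() if now - t < 86_400]  # keep last day
        last_minute = [t for t in stamps if now - t < 60]
        if len(last_minute) >= self.rpm_limit or len(stamps) >= self.rpd_limit:
            return False
        stamps.append(now)
        self.path.write_text(json.dumps(stamps))
        return True

# Demo with a tiny RPM ceiling and a fixed clock for determinism
path = os.path.join(tempfile.mkdtemp(), "usage_stats.json")
tracker = UsageTracker(path, rpm_limit=2, rpd_limit=1000)
results = [tracker.allow(now=100.0) for _ in range(3)]
print(results)
```

Because the timestamps survive on disk, limits hold across separate CLI invocations, which is what keeps a free-tier key from being banned mid-session.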

````diff
@@ -746,7 +852,60 @@ You've essentially defined a **code intelligence system**, not a chatbot with em
 19. **[ ] Scalability**: Large monorepo support and distributed processing.
 20. **[ ] Team Sharing**: Remote knowledge store sync and collaboration.
 
+### **Phase 7: Agentic Capabilities (IN PROGRESS)**
+21. **[x] Agent Architecture**: `Agent` class with configuration-driven model selection.
+22. **[x] Multi-Provider Support**: Google Gemini and OpenRouter/OpenAI integration.
+23. **[x] Rate Limiting**: Persistent RPM/RPD tracking and enforcement.
+24. **[ ] Tool Use (ReAct)**: File listing, reference searching, history querying.
+25. **[ ] Advanced Retrieval**: Cross-encoder reranking and graph-enhanced query expansion.
+
+### **Phase 8: IDE Integration (PLANNED)**
+26. **[ ] MCP Server (Layer 10b)**: Tool exposure for external IDE agents.
+27. **[ ] Sufficiency Scoring**: Context confidence metrics for local-first answering.
+28. **[ ] Task-Specific Templates**: Implement debug/refactor/extend/review context prioritization.
+29. **[ ] Multi-hop Queries**: Reachability and impact analysis with configurable depth.
+30. **[ ] Structured Responses**: JSON schema support across all endpoints.
+
 ### **Supporting Tooling & QA (COMPLETED)**
 - **[x] Tests**: Unit/integration/e2e coverage for parsing, indexing, retrieval, API, CLI, storage, and analysis.
 - **[x] CI/CD**: Ruff linting, pytest + coverage, MkDocs build, and automated changelog generation.
 - **[x] Evaluation Utilities**: Retrieval-quality evaluation script (`scripts/evaluate.py`).
+
+---
````
````diff
+
+## **Primary Use-Cases**
+
+### **Use-Case 1: Developer Q&A with Detailed Answers**
+
+> As a developer, I want to ask questions about my codebase in plain English and get detailed, step-by-step answers with code snippets.
+
+**Workflow**:
+1. Developer asks: "Explain what happens when 'knowcode ask' runs"
+2. System identifies the question type (explanation)
+3. Agent retrieves relevant entities via semantic search
+4. Context synthesizer builds a token-budgeted bundle
+5. LLM generates a step-by-step explanation with code snippets
+
+**Key Capabilities Required**:
+- Query-type detection (Layer 10a)
+- Task-specific templates (Layer 9)
+- Multi-hop call graph traversal (Layer 8)
+- ReAct tool-use for complex queries (Layer 10a)
+
````
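The five-step workflow above can be sketched as a small pipeline. Every callable and the naive question-type heuristic here are illustrative stand-ins, not KnowCode's real components.

```python
def answer_question(question, search, synthesize, generate, max_tokens=4000):
    """Sketch of the 'knowcode ask' workflow: detect type, retrieve, bundle, generate."""
    # Step 2: naive question-type detection (the real system is richer)
    qtype = "explanation" if question.lower().startswith(("explain", "what", "how")) else "general"
    entities = search(question)                 # Step 3: semantic search
    bundle = synthesize(entities, max_tokens)   # Step 4: token-budgeted context bundle
    return generate(qtype, question, bundle)    # Step 5: LLM answer

# Stub components standing in for the retrieval and LLM layers
answer = answer_question(
    "Explain what happens when 'knowcode ask' runs",
    search=lambda q: ["cli.ask", "agent.Agent.run"],
    synthesize=lambda ents, budget: f"{len(ents)} entities within {budget} tokens",
    generate=lambda qtype, q, ctx: f"[{qtype}] using {ctx}",
)
print(answer)
```

Passing the stages in as callables keeps the pipeline testable without a live index or API key, which is one plausible way to wire the layers together.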
````diff
+### **Use-Case 2: IDE Agent Integration for Token Efficiency**
+
+> When prompting an IDE agent (e.g., Antigravity), it invokes KnowCode tools to retrieve context locally, minimizing expensive external LLM token usage.
+
+**Workflow**:
+1. User prompts the IDE agent
+2. IDE agent invokes KnowCode tools via MCP
+3. KnowCode returns context with a sufficiency score
+4. If score >= 0.8: agent answers locally (zero external tokens)
+5. If score < 0.8: agent uses the returned context with an external LLM (controlled tokens)
+
+**Key Capabilities Required**:
+- MCP Server (Layer 10b)
+- Sufficiency scoring (Layer 9)
+- Structured tool responses (Layer 10b)
+- Token budget reporting (Layer 9)
````
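The branching in steps 4 and 5 above reduces to a threshold check on the sufficiency score. A sketch under stated assumptions: the dict shape mirrors the `get_entity_context` response described in Layer 10b, and the 0.8 threshold comes from the workflow; the function itself is hypothetical.

```python
def route_query(context):
    """Decide whether an IDE agent can answer locally from KnowCode context."""
    if context["sufficiency_score"] >= 0.8:
        return ("local", 0)  # answer locally: zero external tokens
    # otherwise forward the (token-counted) context bundle to an external LLM
    return ("external_llm", context["token_count"])

rich = {"sufficiency_score": 0.92, "token_count": 1800}
thin = {"sufficiency_score": 0.55, "token_count": 650}
print(route_query(rich), route_query(thin))
```

Reporting the token count alongside the routing decision lets the agent budget exactly what a fallback external call will cost.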

README.md

Lines changed: 22 additions & 4 deletions

````diff
@@ -24,8 +24,8 @@ source .venv/bin/activate # On Windows: .venv\Scripts\activate
 # Install KnowCode (with dev dependencies)
 uv sync --dev
 
-# Set OpenAI API Key (required for semantic search and 'ask' command)
-export GOOGLE_API_KEY="sk-..."
+# Set Google API Key (required for semantic search and 'ask' command)
+export GOOGLE_API_KEY="AIza..."
 ```
 
 ## Quick Start
````
````diff
@@ -141,7 +141,6 @@ knowcode semantic-search <query> [--index <path>] [--limit <n>]
 ```bash
 knowcode semantic-search "Where is the graph built?"
 ```
-```
 
 ### `server`
 Start the FastAPI intelligence server. This is the preferred way for locally hosted AI agents (IDEs) to interact with KnowCode.
````
````diff
@@ -181,7 +180,26 @@ knowcode history "KnowledgeStore"
 Ask questions about the codebase using an LLM agent. Requires `GOOGLE_API_KEY` environment variable.
 
 ```bash
-knowcode ask <question> [--model <model>]
+knowcode ask <question> [--config <path>]
+```
+
+**Configuration:**
+KnowCode looks for a configuration file in the following order:
+1. `--config` argument
+2. `aimodels.yaml` in current directory
+3. `~/.aimodels.yaml`
+
+**Example `aimodels.yaml`:**
+```yaml
+models:
+  - name: gemini-2.0-flash-lite
+    provider: google
+    api_key_env: GOOGLE_API_KEY
+    rpm_free_tier_limit: 10
+    rpd_free_tier_limit: 1000
+  - name: gemini-1.5-flash
+    provider: google
+    api_key_env: GOOGLE_API_KEY
 ```
 
 **Example:**
````
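The three-step lookup order documented above can be sketched as a simple resolver; `resolve_config` is a hypothetical helper for illustration, not KnowCode's actual function.

```python
from pathlib import Path

def resolve_config(cli_path=None):
    """Return the first existing config file, mirroring the documented order:
    the --config argument, then ./aimodels.yaml, then ~/.aimodels.yaml."""
    candidates = ([Path(cli_path)] if cli_path else []) + [
        Path("aimodels.yaml"),
        Path.home() / ".aimodels.yaml",
    ]
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None  # assumed fallback: built-in defaults

# Demonstrate that an explicit --config path wins when it exists
import tempfile
with tempfile.NamedTemporaryFile(suffix=".yaml", delete=False) as f:
    explicit = f.name
print(resolve_config(explicit) == Path(explicit))
```

Checking `is_file()` rather than `exists()` avoids accidentally treating a directory named `aimodels.yaml` as a config.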
