DeepRetrieval is a minimal, production-oriented implementation of a Retrieval-Augmented Generation (RAG) agent exposed through a Model Context Protocol (MCP) server, built from scratch without external agent frameworks.
Key Features
- Custom RAG Pipeline
  - Document chunking
  - TF-IDF / classical vector search
  - Context injection into prompts
- MCP Server (JSON-RPC)
  - Fully compliant MCP tool server
  - Stdio-based transport
- Agentic Tool Calling
  - Explicit tool routing
  - Deterministic execution
  - No hidden orchestration
- Local & Web Search Tools
  - Local document search
  - Optional web search integration
- Zero Framework Dependency
- Ollama as the local LLM, for seamless streaming of query responses
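
The MCP Server feature listed above (a JSON-RPC tool server over stdio) can be pictured with a short sketch. It assumes newline-delimited JSON-RPC 2.0 messages and the `tools/list` and `tools/call` method names from the MCP specification. The `search_docs` tool, its handler, and the `handle_request` / `serve` helpers are hypothetical names used only for illustration. The initialization handshake and notification handling that a complete MCP server needs are left out, so this is not the project's actual server.

```python
import json
import sys

def search_docs(arguments):
    """Hypothetical tool: echoes the query instead of searching real documents."""
    return f"results for: {arguments.get('query', '')}"

# Explicit registry: tool name -> (description, input schema, handler).
TOOLS = {
    "search_docs": (
        "Search local documents",
        {"type": "object", "properties": {"query": {"type": "string"}}},
        search_docs,
    ),
}

def error(req_id, code, message):
    """Build a JSON-RPC 2.0 error response."""
    return {"jsonrpc": "2.0", "id": req_id, "error": {"code": code, "message": message}}

def handle_request(request):
    """Map one decoded JSON-RPC 2.0 request to a response dict."""
    if not isinstance(request, dict):
        return error(None, -32600, "Invalid Request")
    req_id = request.get("id")
    method = request.get("method")
    params = request.get("params") or {}

    if method == "tools/list":
        tools = [
            {"name": name, "description": desc, "inputSchema": schema}
            for name, (desc, schema, _) in TOOLS.items()
        ]
        return {"jsonrpc": "2.0", "id": req_id, "result": {"tools": tools}}

    if method == "tools/call":
        entry = TOOLS.get(params.get("name"))
        if entry is None:
            return error(req_id, -32602, "Unknown tool")
        text = entry[2](params.get("arguments") or {})
        return {"jsonrpc": "2.0", "id": req_id,
                "result": {"content": [{"type": "text", "text": text}]}}

    return error(req_id, -32601, "Method not found")

def serve(stdin=sys.stdin, stdout=sys.stdout):
    """Read one JSON message per line and write one response per line."""
    for line in stdin:
        line = line.strip()
        if not line:
            continue
        try:
            request = json.loads(line)
        except json.JSONDecodeError:
            response = error(None, -32700, "Parse error")
        else:
            response = handle_request(request)
        stdout.write(json.dumps(response) + "\n")
        stdout.flush()
```

Calling `serve()` with no arguments would read requests from standard input until it closes. Keeping the registry as a plain dictionary means every callable tool is visible in one place, which fits the "no hidden orchestration" goal.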
User Query
    ↓
Agent Controller
    ↓
Tool Router
 ├── Document Search Tool (TF-IDF)
 ├── Web Search Tool
 └── Utility Tools
    ↓
Context Composer
    ↓
LLM (Answer strictly from provided context)
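
The flow above can be sketched as a handful of plain functions. This is an illustrative outline and not the project's code. The `web:` prefix rule, the stub tools, and all function names (`route`, `compose_context`, `run_agent`) are assumptions. The final LLM call is omitted, so the sketch stops at the composed prompt.

```python
from typing import Callable, Dict, List

def document_search(query: str) -> List[str]:
    """Stub for the TF-IDF document search tool; returns a canned passage."""
    return [f"[doc] passage matching '{query}'"]

def web_search(query: str) -> List[str]:
    """Stub for the optional web search tool."""
    return [f"[web] result for '{query}'"]

# Tool Router table: every route is listed explicitly.
ROUTES: Dict[str, Callable[[str], List[str]]] = {
    "docs": document_search,
    "web": web_search,
}

WEB_PREFIX = "web:"

def route(query: str) -> str:
    """Deterministic rule (an assumption): a 'web:' prefix selects web search."""
    return "web" if query.lower().startswith(WEB_PREFIX) else "docs"

def compose_context(query: str, passages: List[str]) -> str:
    """Context Composer: number the passages and pin the model to them."""
    numbered = "\n".join(f"[{i}] {p}" for i, p in enumerate(passages, start=1))
    return (
        "Answer strictly from the context below. "
        "If the answer is not in the context, say so.\n\n"
        f"Context:\n{numbered}\n\nQuestion: {query}"
    )

def run_agent(query: str) -> dict:
    """Agent Controller: route, run the tool, compose the prompt, keep a trace."""
    tool_name = route(query)
    clean_query = query[len(WEB_PREFIX):].strip() if tool_name == "web" else query
    passages = ROUTES[tool_name](clean_query)
    prompt = compose_context(clean_query, passages)
    # Returning the tool name and passages alongside the prompt keeps the run auditable.
    return {"tool": tool_name, "passages": passages, "prompt": prompt}
```

Because routing is a pure function of the query text, the same query always selects the same tool, and the returned dictionary records which tool ran and what it returned.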
Why This Project Matters
Many production AI systems do not rely on public agent frameworks. This project demonstrates how to build:
- Reliable agents
- Transparent reasoning
- Auditable tool execution
- Vendor-neutral architectures

All of these are built from scratch.
- Built from scratch - No LangChain dependency, which demonstrates an understanding of each pipeline stage
- Proper architecture - Clean separation: chunking, embedding, retrieval, inference
- Working semantic search - Similarity scores (0.83, 0.74, 0.71) show that retrieval is finding relevant content
- Database integration - Vector storage with proper indexing
- Token-aware chunking - Smart overlap strategy preserves context
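
The chunking and similarity-scoring points above can be illustrated with a small standard-library sketch. It makes several assumptions. Whitespace-separated words stand in for model tokens. The IDF is a smoothed variant chosen for this example. The function names and default sizes are hypothetical. It does not reproduce the scores quoted above or the project's storage layer.

```python
import math
import re
from collections import Counter
from typing import Dict, List, Tuple

def tokenize(text: str) -> List[str]:
    """Lowercase and split into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())

def chunk_tokens(text: str, chunk_size: int = 50, overlap: int = 10) -> List[str]:
    """Split text into windows of chunk_size words that share `overlap` words."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")
    tokens = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(" ".join(tokens[start:start + chunk_size]))
        if start + chunk_size >= len(tokens):
            break  # the last window already reached the end of the text
    return chunks

def tfidf_vectors(docs: List[str]) -> Tuple[List[Dict[str, float]], Dict[str, float]]:
    """Return one sparse TF-IDF vector per document, plus the IDF table."""
    tokenized = [tokenize(d) for d in docs]
    n = len(docs)
    df = Counter(term for toks in tokenized for term in set(toks))
    # Smoothed IDF so that terms present in every document keep a nonzero weight.
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = [{t: c * idf[t] for t, c in Counter(toks).items()} for toks in tokenized]
    return vectors, idf

def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def search(query: str, chunks: List[str], top_k: int = 3) -> List[Tuple[float, str]]:
    """Rank chunks by cosine similarity to the query, highest first."""
    vectors, idf = tfidf_vectors(chunks)
    # Query terms that never occur in the corpus are ignored.
    q_vec = {t: c * idf[t] for t, c in Counter(tokenize(query)).items() if t in idf}
    scored = [(cosine(q_vec, v), chunk) for v, chunk in zip(vectors, chunks)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]
```

For example, `search("cat on mat", chunk_tokens(document_text))` would return up to three `(score, chunk)` pairs, where `document_text` is any string you supply. The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, as long as it is shorter than the overlap.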
Aditya Katkar
