An interactive command-line assistant that answers questions using local documents and web search, powered by Groq LLMs and LangGraph. It supports memory, reranking, and citation-based answers with source attribution.
- Retrieval-Augmented Generation — Combines vector search (Chroma) and BM25 with optional web search for robust answers.
- Source attribution — Answers cite sources with labels: `[L1]` (local docs), `[W1]` (web), `[M1]` (memory).
- Memory store — Optional recall and storage of facts across sessions.
- Web search — DuckDuckGo for live web results.
- Document ingestion — PDF and text files in `data/`, chunked and indexed for retrieval.
- Reranking — Cross-encoder to select the most relevant context before generation.
- Interactive CLI — Rich interface with session management and command history.
- Critic module — Verifies factual claims against evidence; can trigger a second pass (e.g. web) when more evidence is needed.
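The citation labels above can be illustrated with a small sketch. This is not the project's actual code, just a minimal example of how snippets from different origins might be numbered into `L#`/`W#`/`M#` labels:

```python
# Illustrative sketch (not the project's implementation): assign citation
# labels to retrieved snippets based on their origin.
def label_sources(snippets):
    """Map (text, origin) pairs to labeled citations like [L1], [W1], [M1]."""
    prefixes = {"local": "L", "web": "W", "memory": "M"}
    counters = {"L": 0, "W": 0, "M": 0}
    labeled = []
    for text, origin in snippets:
        prefix = prefixes[origin]
        counters[prefix] += 1
        labeled.append((f"[{prefix}{counters[prefix]}]", text))
    return labeled

labels = label_sources([
    ("Harry attends Hogwarts.", "local"),
    ("The first film premiered in 2001.", "web"),
    ("User previously asked about Gryffindor.", "memory"),
])
# labels[0][0] == "[L1]", labels[1][0] == "[W1]", labels[2][0] == "[M1]"
```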
- Python 3.12+
- Groq API key (required for the LLM)
- Clone and install dependencies

  ```sh
  cd Agentic-Rag-for-Harry-Potter
  uv sync  # or: pip install -e .
  ```
- Configure environment

  ```sh
  cp .env.example .env
  ```

  Edit `.env` and set at least `GROQ_API_KEY` — get one at the Groq Console.
- Ingest documents

  Place PDFs or text files in `data/` (e.g. `data/harrypotter.pdf`), then run:

  ```sh
  python -m src.ingest
  ```

  This builds the vector index (Chroma), BM25 index, and parent map under `./storage/`. Re-run after adding or changing documents.
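Ingestion splits documents into overlapping chunks before indexing. A minimal sketch of fixed-size chunking with overlap (the sizes here are illustrative assumptions, not the project's actual parameters):

```python
# Sketch of fixed-size character chunking with overlap, similar in spirit
# to what an ingest step does before building vector and BM25 indexes.
def chunk_text(text, size=200, overlap=50):
    """Split text into overlapping character windows of `size` chars."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

chunks = chunk_text("x" * 500, size=200, overlap=50)
# 500 chars with step 150 -> windows starting at 0, 150, 300 -> 3 chunks
```

Overlap keeps sentences that straddle a chunk boundary retrievable from at least one chunk.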
Interactive CLI (default):

```sh
python main.py
```

Type a question and press Enter. Available commands:

- `/new` — start a new session (new thread id)
- `/id` — show the current session id
- `/quit` — exit
- `/help` — list commands
- `/clear-mem` — clear long-term memories
Single question (non-interactive):

```sh
python main.py -q "Who is Harry Potter?"
python main.py --thread-id my-session -q "What house is Harry in?"
```

| Variable | Required | Default | Description |
|---|---|---|---|
| `GROQ_API_KEY` | Yes | — | Groq API key for the LLM. |
| `RAG_USE_MEMORY` | No | `1` | Use memory recall: `1` or `0`. |
| `RAG_ALLOW_MEMORY_CITATIONS` | No | `0` | Allow citing memory (`M#`) in answers: `1` or `0`. |
| `RAG_REMEMBER_ANSWERS` | No | `0` | Persist answers into memory after each run: `1` or `0`. |
See `.env.example` for a copy-paste template.
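A minimal `.env` using the variables above might look like this (the key value is a placeholder):

```sh
GROQ_API_KEY=gsk_your_key_here
RAG_USE_MEMORY=1
RAG_ALLOW_MEMORY_CITATIONS=0
RAG_REMEMBER_ANSWERS=0
```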
```
├── main.py               # Entry point: CLI, single-question mode
├── data/                 # Documents to ingest (PDF, txt, etc.)
├── storage/              # Created by ingest: Chroma, BM25, parent map
├── state.sqlite          # LangGraph checkpoint state (created on first run)
├── src/
│   ├── graph.py          # LangGraph pipeline: plan → retrieve → maybe_web → rerank → answer → critic
│   ├── ingest.py         # Document chunking, embeddings, Chroma + BM25
│   ├── retrieval.py      # Vector + BM25 search, RRF merge, parent expand
│   ├── compression.py    # Contextual compression of snippets
│   ├── rerank.py         # Cross-encoder reranking
│   ├── webtools.py       # Web search (DuckDuckGo) + page fetch
│   ├── llm.py            # Groq chat wrapper
│   └── memory_store.py   # Long-term memory (recall / remember)
├── .env.example
└── README.md
```
- Plan — Classifies the query as local-only or web-augmented (e.g. “actor”, “movie”, “today” → web).
- Retrieve — Vector + BM25 over `storage/`, RRF merge, parent expansion, contextual compression; optional memory recall.
- Maybe web — If routed to web, runs search, fetches pages, compresses the results, and adds them as `W#` sources.
- Rerank — Cross-encoder narrows context to the top snippets (keeps at least one web snippet when web was used).
- Answer — LLM generates an answer with citations only from the allowed labels.
- Critic — A second LLM pass checks facts against the evidence; if it returns `NEED_MORE:`, the graph loops back (e.g. forces a web pass) and continues.
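The RRF merge in the Retrieve step can be sketched as a generic Reciprocal Rank Fusion over the vector and BM25 result lists. This uses the conventional `k = 60` smoothing constant and is not the project's exact code:

```python
# Generic Reciprocal Rank Fusion: fuse ranked lists (e.g. vector search
# and BM25 results) into a single ranking. Each document scores
# 1 / (k + rank) per list it appears in; k=60 is the usual constant.
def rrf_merge(rankings, k=60):
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

vector_hits = ["d1", "d2", "d3"]
bm25_hits = ["d3", "d1", "d4"]
merged = rrf_merge([vector_hits, bm25_hits])
# d1 ranks highly in both lists, so it leads the fused ranking
```

RRF avoids comparing raw vector similarities against BM25 scores directly: only ranks matter, so the two retrievers need no score calibration.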
State is checkpointed in `state.sqlite` per `--thread-id`, so multi-turn conversations and sessions are stable across runs.