RAG-based project for building a domain knowledge base and answering questions via API or web UI.
- Linux
- Python 3.10+ (recommended)
- pip
- (Optional) a local LLM runtime (e.g., Ollama) if configured in `private_settings.py`
```bash
python3 -m venv .venv
source .venv/bin/activate
pip install --upgrade pip
pip install -r requirements.txt
```

Edit `private_settings.py` to set API keys and runtime options (local vs. online model usage).
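The exact contents of `private_settings.py` depend on your setup; the key names below are illustrative assumptions only, not the file's actual schema:

```python
# private_settings.py -- illustrative sketch; the real key names may differ.
USE_LOCAL_MODEL = True                      # hypothetical toggle: local (Ollama) vs. online model
OLLAMA_BASE_URL = "http://localhost:11434"  # hypothetical: default local Ollama endpoint
OPENAI_API_KEY = ""                         # hypothetical: set only if using an online provider
```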
Use the Streamlit entrypoint:

```bash
streamlit run streamlit_ui.py
```

Instantiate the `Guru` class from `orchestrator/guru.py`.

Example (minimal), from `answer_question.py`:

```python
from orchestrator.guru import Guru

guru = Guru(...)
response = guru.user_message("Your question here")
print(response)
```

`Guru` is the main entry point for question answering.
- `provider` (str): LLM backend provider (e.g., `"ollama"` or `"openai"`).
- `model` (str): chat model name (e.g., `"gpt-oss:120b"` or `"gpt-4"`).
- `embedding` (str): embedding model name (e.g., `"mxbai-embed-large"` or `"text-embedding-3-small"`).
- `language` (str): response language (e.g., `"english"`).
- `temperature` (int | float): generation temperature (example in project: `0`).
- `answer_length` (str): output style/length (example: `"compact"`).
- `knowledge_base` (str): knowledge base storage folder (similar concept used in KB creation, e.g., `"Switzerland"`).
Example with explicit parameters:

```python
from orchestrator.guru import Guru

guru = Guru(
    provider="ollama",
    model="gpt-oss:120b",
    embedding="mxbai-embed-large",
    language="english",
    temperature=0,
    answer_length="compact",
    knowledge_base="Switzerland",
    use_knowledge=True,
)
```

Primary method used in this project:
`user_message(question)`

- Input: `question` (str): the user request/question in natural language. Example: `"How can I reduce heating energy consumption at home?"`
- Output: `response` (str): the generated answer text from the RAG pipeline, ready to be shown to the user or returned by an API endpoint.
Minimal usage flow:

- Create a `Guru` instance with your project configuration.
- Pass a question string to `user_message(...)`.
- Return or print the resulting answer string.
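The flow above can be sketched end to end. A stand-in class replaces `Guru` here so the snippet runs without the project installed; the real class lives in `orchestrator/guru.py`:

```python
class StubGuru:
    """Stand-in mimicking Guru's user_message(question) -> str interface."""

    def __init__(self, **config):
        self.config = config

    def user_message(self, question: str) -> str:
        # The real Guru runs the RAG pipeline here.
        return f"Answer to: {question}"


# 1. Create an instance with your project configuration.
guru = StubGuru(provider="ollama", answer_length="compact")
# 2. Pass a question string to user_message(...).
response = guru.user_message("How can I reduce heating energy consumption at home?")
# 3. Return or print the resulting answer string.
print(response)
```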
Run the knowledge base creator script:

```bash
python build_knowledge_base.py
```

You can run the benchmark with:

```bash
python run_benchmark.py
```

- `build_knowledge_base.py` — build/update the knowledge base from sources
- `answer_question.py` — CLI-style question answering entrypoint
- `streamlit_ui.py` — web interface
- `run_benchmark.py` — benchmark runner
- `orchestrator/guru.py` — main orchestrator class (`Guru`)
- `knowledge_base/` — extraction and storage logic
- `llm/` — LLM integration layer
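As a rough illustration of what a benchmark runner computes (a sketch under an assumed exact-match metric, not the project's actual `run_benchmark.py` logic):

```python
def exact_match_accuracy(answer_fn, dataset):
    """Fraction of (question, expected) pairs the answer function gets exactly right."""
    hits = sum(
        1
        for question, expected in dataset
        if answer_fn(question).strip().lower() == expected.strip().lower()
    )
    return hits / len(dataset)


# Toy dataset and a trivial stub answerer, purely for illustration.
dataset = [
    ("What is the capital of Switzerland?", "Bern"),
    ("What is the largest Swiss city?", "Zurich"),
]
accuracy = exact_match_accuracy(lambda q: "Bern", dataset)
print(accuracy)  # the stub answers "Bern" to everything, so 1 of 2 is correct
```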
```
MDER-DR_RAG/
├── answer_question.py          # CLI-style question answering entrypoint
├── build_knowledge_base.py     # Build/update the knowledge base from sources
├── run_benchmark.py            # Benchmark runner
├── streamlit_ui.py             # Web interface (Streamlit)
├── private_settings.py         # Local/private runtime settings
├── requirements.txt            # Python dependencies
├── readme.md
├── LICENSE
├── benchmark/
│   ├── __init__.py
│   └── benchmark.py
├── knowledge_base/
│   ├── __init__.py
│   ├── knowledge_extractor.py  # Extracts content and constructs the knowledge graph
│   ├── knowledge_manager.py    # Loads/searches knowledge in the graph at query time
│   ├── data/                   # Stored graph files / serialized KB artifacts
│   └── utils/                  # Helper modules used by KB build/query logic
│       ├── energenius_graph.py
│       ├── graph_helpers.py
│       ├── graph_parameter.py
│       ├── graph_prompt.py
│       └── syntactic_disambiguator.py
├── llm/
│   ├── __init__.py
│   └── langchain.py            # LLM + embedding integration layer
├── orchestrator/
│   ├── __init__.py
│   ├── abstract_orchestrator.py
│   ├── guru.py                 # Main API orchestrator class (Guru)
│   └── live_orchestrator.py
└── data/
```
- `knowledge_base/knowledge_extractor.py` is used to create/build the graph.
- `knowledge_base/knowledge_manager.py` is used to retrieve/search knowledge in the graph.
- `knowledge_base/data/` stores graph artifacts.
- `knowledge_base/utils/` contains helper utilities for graph creation and processing.
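As a conceptual sketch (not the project's implementation) of the query-time role `knowledge_manager.py` plays: find entities mentioned in the question, then gather their summaries plus one-hop neighbours to support multi-hop answers. All entity names and summaries below are toy data:

```python
# Toy entity-centric knowledge graph: entity -> summary + neighbour entities.
graph = {
    "heat pump": {
        "summary": "A heat pump moves heat instead of generating it.",
        "neighbors": ["heating energy"],
    },
    "heating energy": {
        "summary": "Heating energy is the largest share of household consumption.",
        "neighbors": ["heat pump", "insulation"],
    },
    "insulation": {
        "summary": "Insulation reduces heat loss through walls and roofs.",
        "neighbors": ["heating energy"],
    },
}


def retrieve(question: str, hops: int = 1) -> list[str]:
    """Return summaries for entities named in the question, expanded by n hops."""
    q = question.lower()
    frontier = {entity for entity in graph if entity in q}
    seen = set(frontier)
    for _ in range(hops):
        # Expand to neighbours not yet collected (the "multi-hop" step).
        frontier = {n for e in frontier for n in graph[e]["neighbors"]} - seen
        seen |= frontier
    return [graph[e]["summary"] for e in sorted(seen)]


context = retrieve("How much heating energy can insulation save?")
```

Here `retrieve` matches `"heating energy"` and `"insulation"` directly, then pulls in `"heat pump"` as a one-hop neighbour; the collected summaries would form the context handed to the LLM.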
- Create and activate a virtual environment
- Install dependencies from `requirements.txt`
- Configure `private_settings.py`
- Build the KB with `python build_knowledge_base.py` (or copy an existing KB to `knowledge_base/data/`)
- Run either:
  - Web UI: `streamlit run streamlit_ui.py`
  - API integration: instantiate `Guru` in your application/tests
  - Benchmark: `python run_benchmark.py`
See LICENSE.
```bibtex
@misc{campi2026mderdrmultihopquestionanswering,
  title={MDER-DR: Multi-Hop Question Answering with Entity-Centric Summaries},
  author={Riccardo Campi and Nicolò Oreste Pinciroli Vago and Mathyas Giudici and Marco Brambilla and Piero Fraternali},
  year={2026},
  eprint={2603.11223},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2603.11223},
}
```