MDER-DR_RAG

RAG-based project for building a domain knowledge base and answering questions via API or web UI.

Requirements

Linux
Python 3.10+ (recommended)
pip
(Optional) local LLM runtime (e.g., Ollama) if configured in private_settings.py

Installation (venv + requirements.txt)

python3 -m venv .venv
source .venv/bin/activate
pip install --upgrade pip
pip install -r requirements.txt
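
After installing, a quick sanity check can confirm that the interpreter meets the recommended version and that the virtual environment is active. The sketch below uses only the standard library; the function names are illustrative and not part of the project.

```python
import sys


def python_is_supported(version_info=sys.version_info, minimum=(3, 10)):
    """Return True if the interpreter is at least the recommended version."""
    return tuple(version_info[:2]) >= minimum


def in_virtualenv(prefix=None, base_prefix=None):
    """Return True when running inside a venv (prefix differs from base prefix)."""
    prefix = sys.prefix if prefix is None else prefix
    base_prefix = sys.base_prefix if base_prefix is None else base_prefix
    return prefix != base_prefix


if __name__ == "__main__":
    print("Python 3.10+:", python_is_supported())
    print("Virtual environment active:", in_virtualenv())
```

Run it with the same interpreter you used to create .venv; both lines should report True before you continue.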

Configuration

Edit private_settings.py to set API keys and runtime options (local vs online model usage).
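
The contents of private_settings.py are not shown here, so the following is only a sketch of one way such a settings module could keep secrets out of version control by reading them from environment variables. The names OPENAI_API_KEY, USE_LOCAL_MODEL, and load_settings are assumptions for illustration, not the project's actual settings.

```python
import os


def load_settings(environ=None):
    """Build a settings dict from environment variables (hypothetical names)."""
    environ = os.environ if environ is None else environ
    # Hypothetical flag: treat "1", "true", "yes" (any case) as enabled.
    flag = environ.get("USE_LOCAL_MODEL", "true").strip().lower()
    use_local = flag in {"1", "true", "yes"}
    api_key = environ.get("OPENAI_API_KEY", "")
    if not use_local and not api_key:
        # An online provider needs a key; fail early with a clear message.
        raise ValueError("OPENAI_API_KEY must be set when USE_LOCAL_MODEL is disabled")
    return {"use_local_model": use_local, "api_key": api_key}
```

Failing at load time keeps a missing key from surfacing later as an opaque provider error.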

Run modes

1) Run web interface (Streamlit)

Use the Streamlit entrypoint:

streamlit run streamlit_ui.py

2) Run question answering directly

Instantiate the Guru class from:

orchestrator/guru.py

Example (minimal), from answer_question.py:

from orchestrator.guru import Guru

guru = Guru(...)
response = guru.user_message("Your question here")
print(response)

Guru class

Guru is the main entry point for question answering.

Parameters needed to instantiate `Guru`

provider (str)
LLM backend provider (e.g., "ollama" or "openai").
model (str)
Chat/model name (e.g., "gpt-oss:120b" or "gpt-4").
embedding (str)
Embedding model name (e.g., "mxbai-embed-large" or "text-embedding-3-small").
language (str)
Response language (e.g., "english").
temperature (int | float)
Generation temperature (example in project: 0).
answer_length (str)
Output style/length (example: "compact").
knowledge_base (str)
Name of the knowledge base storage folder, i.e. the same name used when the KB was created (e.g., "Switzerland").

Example with explicit parameters:

from orchestrator.guru import Guru

guru = Guru(
    provider="ollama",
    model="gpt-oss:120b",
    embedding="mxbai-embed-large",
    language="english",
    temperature=0,
    answer_length="compact",
    knowledge_base="Switzerland",
    use_knowledge=True,
)
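
One way to keep these parameters in one place is a small configuration object that validates values before they reach Guru. This is a sketch, not part of the project: GuruConfig and as_kwargs are hypothetical names, and the defaults simply mirror the example values in this README rather than any defaults Guru itself may define.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class GuruConfig:
    """Hypothetical container for the documented Guru parameters."""
    provider: str
    model: str
    embedding: str
    language: str = "english"
    temperature: float = 0
    answer_length: str = "compact"
    knowledge_base: str = "Switzerland"
    use_knowledge: bool = True

    def __post_init__(self):
        # Reject empty strings early instead of failing inside the pipeline.
        for name in ("provider", "model", "embedding", "language",
                     "answer_length", "knowledge_base"):
            value = getattr(self, name)
            if not isinstance(value, str) or not value.strip():
                raise ValueError(f"{name} must be a non-empty string")
        # bool is a subclass of int, so exclude it explicitly.
        if isinstance(self.temperature, bool) or not isinstance(self.temperature, (int, float)):
            raise TypeError("temperature must be an int or float")

    def as_kwargs(self):
        """Return the fields as keyword arguments for Guru(**kwargs)."""
        return asdict(self)


config = GuruConfig(provider="ollama", model="gpt-oss:120b", embedding="mxbai-embed-large")
# With the project on the import path, the instance would be created like this:
# from orchestrator.guru import Guru
# guru = Guru(**config.as_kwargs())
```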

Inputs and outputs

Primary method used in this project:

user_message(question)

Input:

question (str): the user request/question in natural language.
Example: "How can I reduce heating energy consumption at home?"

Output:

response (str): generated answer text from the RAG pipeline, ready to be shown to the user or returned by an API endpoint.

Minimal usage flow:

1. Create a Guru instance with your project configuration.
2. Pass a question string to user_message(...).
3. Return or print the resulting answer string.
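
The three steps above can be wrapped in a small helper that checks the input before calling user_message. This is a sketch: ask is a hypothetical helper, and EchoGuru is a stand-in used only so the example runs without the project installed; in practice you would pass a real Guru instance.

```python
def ask(guru, question):
    """Send a question to any object exposing user_message(question) -> str."""
    if not isinstance(question, str) or not question.strip():
        raise ValueError("question must be a non-empty string")
    response = guru.user_message(question.strip())
    # The README documents the response as a string; convert defensively.
    return str(response)


class EchoGuru:
    """Stand-in for Guru so the sketch runs without the project installed."""

    def user_message(self, question):
        return f"(stub answer to: {question})"


if __name__ == "__main__":
    print(ask(EchoGuru(), "How can I reduce heating energy consumption at home?"))
```

Because ask only relies on the user_message method, the same helper can be reused in tests with a stub and in an application with the real class.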

3) Create / rebuild knowledge base

Run the knowledge base creator script:

python build_knowledge_base.py

4) Run benchmark

You can run the benchmark with:

python run_benchmark.py

Main project files

build_knowledge_base.py — build/update the knowledge base from sources
answer_question.py — CLI-style question answering entrypoint
streamlit_ui.py — web interface
run_benchmark.py — benchmark runner
orchestrator/guru.py — main orchestrator class (Guru)
knowledge_base/ — extraction and storage logic
llm/ — LLM integration layer

Project tree

MDER-DR_RAG/
├── answer_question.py                # CLI-style question answering entrypoint
├── build_knowledge_base.py           # Build/update the knowledge base from sources
├── run_benchmark.py                  # Benchmark runner
├── streamlit_ui.py                   # Web interface (Streamlit)
├── private_settings.py               # Local/private runtime settings
├── requirements.txt                  # Python dependencies
├── README.md
├── LICENSE
├── benchmark/
│   ├── __init__.py
│   └── benchmark.py
├── knowledge_base/
│   ├── __init__.py
│   ├── knowledge_extractor.py        # Extracts content and constructs the knowledge graph
│   ├── knowledge_manager.py          # Loads/searches knowledge in the graph at query time
│   ├── data/                         # Stored graph files / serialized KB artifacts
│   └── utils/                        # Helper modules used by KB build/query logic
│       ├── energenius_graph.py
│       ├── graph_helpers.py
│       ├── graph_parameter.py
│       ├── graph_prompt.py
│       └── syntactic_disambiguator.py
├── llm/
│   ├── __init__.py
│   └── langchain.py                  # LLM + embedding integration layer
├── orchestrator/
│   ├── __init__.py
│   ├── abstract_orchestrator.py
│   ├── guru.py                       # Main API orchestrator class (Guru)
│   └── live_orchestrator.py
└── data/

Notes

knowledge_base/knowledge_extractor.py is used to create/build the graph.
knowledge_base/knowledge_manager.py is used to retrieve/search knowledge in the graph.
knowledge_base/data/ stores graph artifacts.
knowledge_base/utils/ contains helper utilities for graph creation and processing.
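
The split between building the graph and querying it can be illustrated with a toy in-memory example. This is not the project's implementation: the internals of knowledge_extractor.py and knowledge_manager.py are not described here, and build_graph, search, and the sample facts are all hypothetical.

```python
from collections import defaultdict


def build_graph(triples):
    """Build step: turn (subject, relation, object) triples into an adjacency map."""
    graph = defaultdict(list)
    for subject, relation, obj in triples:
        graph[subject].append((relation, obj))
    return dict(graph)


def search(graph, entity, hops=1):
    """Query step: collect facts reachable from an entity within `hops` edges."""
    facts, frontier, seen = [], [entity], {entity}
    for _ in range(hops):
        next_frontier = []
        for node in frontier:
            for relation, obj in graph.get(node, []):
                facts.append((node, relation, obj))
                if obj not in seen:
                    seen.add(obj)
                    next_frontier.append(obj)
        frontier = next_frontier
    return facts


# Sample facts invented for the illustration.
graph = build_graph([
    ("heat pump", "reduces", "heating energy use"),
    ("heating energy use", "affects", "energy bill"),
])
```

Calling search(graph, "heat pump", hops=2) follows two edges and returns both facts, which is the basic idea behind answering a question that needs more than one hop.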

Typical workflow

1. Create and activate a virtual environment.
2. Install dependencies from requirements.txt.
3. Configure private_settings.py.
4. Build the KB with python build_knowledge_base.py (or copy an existing KB to knowledge_base/data/).
5. Run one of:
   - Web UI: streamlit run streamlit_ui.py
   - API integration: instantiate Guru in your application/tests
   - Benchmark: python run_benchmark.py
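
The build-or-reuse decision in the workflow can be scripted. The sketch below checks whether knowledge_base/data/ already contains files and only runs build_knowledge_base.py when it does not. The helper names are hypothetical, and treating any file in the folder as an existing knowledge base is an assumption about how the artifacts are stored.

```python
import subprocess
import sys
from pathlib import Path


def kb_exists(data_dir):
    """Return True if data_dir exists and contains at least one file."""
    path = Path(data_dir)
    return path.is_dir() and any(p.is_file() for p in path.rglob("*"))


def build_command(script="build_knowledge_base.py"):
    """Command that runs the build script with the current interpreter."""
    return [sys.executable, script]


def ensure_kb(data_dir="knowledge_base/data", runner=subprocess.run):
    """Run the build script only when no knowledge base artifacts are found."""
    if kb_exists(data_dir):
        return False  # nothing to do, an existing KB is reused
    runner(build_command(), check=True)
    return True
```

Run ensure_kb() from the repository root so that the relative paths resolve; the runner argument exists only to make the function easy to test without launching a real build.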

License

See LICENSE.

Citation

@misc{campi2026mderdrmultihopquestionanswering,
      title={MDER-DR: Multi-Hop Question Answering with Entity-Centric Summaries}, 
      author={Riccardo Campi and Nicolò Oreste Pinciroli Vago and Mathyas Giudici and Marco Brambilla and Piero Fraternali},
      year={2026},
      eprint={2603.11223},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2603.11223}, 
}
