This branch contains the evaluation resources for Developer Lightspeed 1.9, including the source RAG documentation, the generated datasets, and the performance results across multiple LLMs.
| Path | Description |
|---|---|
| 📁 rhdh-product-docs | The Red Hat Developer Hub (RHDH) product documentation (v1.9) used as the source knowledge base for this evaluation. |
| 📁 dataset | Contains raw and processed Q&A pairs, formatted specifically for the lightspeed-evaluation tool. |
| 📁 evaluation-result | Detailed metrics and outcome reports from the model evaluations. |
| 📄 categories_rhdh.yaml | Manually defined topic groups used to classify and organize the Q&A pairs. |
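The exact schema of categories_rhdh.yaml is specific to this repository; as a rough sketch (with hypothetical category names and field names, not taken from the actual file), a manually defined topic-group file might look like:

```yaml
# Illustrative sketch only -- the real categories_rhdh.yaml may use a
# different structure and different topic names.
categories:
  - name: installation
    description: Installing and upgrading Red Hat Developer Hub
  - name: plugins
    description: Configuring and managing dynamic plugins
  - name: authentication
    description: Identity providers and sign-in configuration
```

Each generated Q&A pair can then be tagged with one of these names so results can be broken down by topic.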
For the Developer Lightspeed 1.9 release, we evaluated five distinct models.
Models evaluated:

- Gemini: gemini-2.5-pro, gemini-2.5-flash-lite
- GPT: gpt-4o-mini, gpt-5.2
- Llama: llama3.1:8b

Judge model used: gemini-2.5-pro
📊 View Results: For a deep dive into the performance metrics, please refer to the Evaluation Results directory.
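The judge-model setup above follows the common LLM-as-judge pattern: the judge (here gemini-2.5-pro) is prompted with the question, the reference answer, and a candidate answer, and returns a score. The prompt wording and score parsing below are a minimal illustrative sketch, not the actual prompts used by the lightspeed-evaluation tool:

```python
# Illustrative LLM-as-judge sketch; the real judge prompt and scoring
# logic live inside the lightspeed-evaluation tool.

def build_judge_prompt(question: str, reference: str, candidate: str) -> str:
    """Assemble a simple grading prompt for the judge model."""
    return (
        "You are an impartial judge. Score the candidate answer from 0 to 1 "
        "for factual agreement with the reference.\n"
        f"Question: {question}\n"
        f"Reference answer: {reference}\n"
        f"Candidate answer: {candidate}\n"
        "Respond with only the numeric score."
    )

def parse_score(raw: str) -> float:
    """Clamp the judge's reply to [0, 1]; treat unparseable output as 0.0."""
    try:
        return min(1.0, max(0.0, float(raw.strip())))
    except ValueError:
        return 0.0
```

The clamping in parse_score guards against judges that occasionally reply outside the requested range or with extra text.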
The dataset in this repository was constructed using a synthetic generation pipeline to ensure comprehensive coverage of the documentation.
- Source Material: The dataset is derived entirely from the RHDH 1.9 Product Docs.
- Generation Tool: We used Ragas (Testset Generation for RAG) to generate diverse Q&A pairs.
- Evaluation Tool: The evaluation was executed using the lightspeed-evaluation tool, which consumes the dataset and calculates performance metrics.
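Before a generated dataset is consumed by an evaluation tool, it is common to validate that every Q&A pair is complete. The field names below are hypothetical (the actual schema is defined by the lightspeed-evaluation tool); this is only a sketch of such a sanity check:

```python
# Hypothetical dataset-validation sketch; real field names and schema
# are dictated by the lightspeed-evaluation tool, not this example.

SAMPLE = [
    {"question": "How do I install RHDH on OpenShift?",
     "answer": "Use the Red Hat Developer Hub Operator.",
     "category": "installation"},
    {"question": "   ",  # blank question: should be filtered out
     "answer": "orphan answer",
     "category": "plugins"},
]

def validate(entries):
    """Keep only entries that have all fields and non-blank question/answer."""
    required = {"question", "answer", "category"}
    return [
        e for e in entries
        if required <= e.keys()
        and e["question"].strip()
        and e["answer"].strip()
    ]

clean = validate(SAMPLE)
print(len(clean))  # the blank-question entry is dropped
```

A filter like this keeps incomplete synthetic generations from skewing the computed metrics.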