
# Developer Lightspeed Evaluation

This branch contains the evaluation resources for Developer Lightspeed 1.9, including the source RAG documentation, the generated datasets, and the performance results across multiple LLMs.

## 📂 Repository Structure

| Path | Description |
| --- | --- |
| 📂 `rhdh-product-docs` | The Red Hat Developer Hub (RHDH) product documentation (v1.9) used as the source knowledge base for this evaluation. |
| 📂 `dataset` | Raw and processed Q&A pairs, formatted for the `lightspeed-evaluation` tool. |
| 📂 `evaluation-result` | Detailed metrics and outcome reports from the model evaluations. |
| 📄 `categories_rhdh.yaml` | Manually defined topic groups used to classify and organize the Q&A pairs. |
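A category file of this kind is typically a simple topic map. The sketch below is hypothetical; the actual keys and topic names in `categories_rhdh.yaml` may differ:

```yaml
# Hypothetical layout; the real categories_rhdh.yaml may use different keys and topics.
categories:
  installation:
    description: Installing and upgrading RHDH
  plugins:
    description: Configuring and managing dynamic plugins
  authentication:
    description: Identity providers and access control
```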

## 🧪 Evaluation Overview

For the Developer Lightspeed 1.9 release, we evaluated five distinct models.

**Models evaluated:**

- **Gemini:** `gemini-2.5-pro`, `gemini-2.5-flash-lite`
- **GPT:** `gpt-4o-mini`, `gpt-5.2`
- **Llama:** `llama3.1:8b`

**Judge model:** `gemini-2.5-pro`

**📊 View Results:** For a deep dive into the performance metrics, see the `evaluation-result` directory.


## ⚙️ Methodology & Generation

The dataset in this repository was constructed using a synthetic generation pipeline to ensure comprehensive coverage of the documentation.

- **Source Material:** The dataset is derived entirely from the RHDH 1.9 product docs (`rhdh-product-docs`).
- **Generation Tool:** We used Ragas (Testset Generation for RAG) to generate diverse Q&A pairs.
- **Evaluation Tool:** The evaluation was executed with the `lightspeed-evaluation` tool, which consumes the dataset and computes performance metrics.
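To illustrate the "raw → processed" step in the pipeline above, here is a minimal sketch of reshaping generated Q&A pairs into flat evaluation records. All field names (`query`, `ground_truth`, `category`) and the sample pair are illustrative assumptions; the actual schema expected by the `lightspeed-evaluation` tool may differ.

```python
import json

def to_eval_record(pair: dict, category: str) -> dict:
    """Reshape one raw generated Q&A pair into a flat evaluation record.

    Field names here are hypothetical, not the tool's documented schema.
    """
    return {
        "query": pair["question"].strip(),
        "ground_truth": pair["answer"].strip(),
        "category": category,
    }

# Illustrative raw output from a testset generator.
raw_pairs = [
    {
        "question": "How do I enable a dynamic plugin in RHDH? ",
        "answer": "Add the plugin to the dynamic plugins configuration.",
    },
]

records = [to_eval_record(p, category="plugins") for p in raw_pairs]
print(json.dumps(records, indent=2))
```

A pass like this also gives a natural place to attach the manually defined categories to each pair before handing the dataset to the evaluation tool.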