This directory contains the detailed evaluation results: comparative visualizations at the root level, plus an individual subdirectory per model containing the raw data and detailed metrics for that model.
The following graphs summarize performance across all evaluated models and topics.
Detailed logs, CSVs, and per-metric breakdowns can be found in each model's respective folder.
For this evaluation, every Q&A pair was scored against five metrics at the Turn Level. See the official Ragas documentation for the metric definitions.
**Response Evaluation**

**Context Evaluation**

**Response Evaluation**

- `answer_correctness`: a custom logic metric designed to validate the accuracy of the final response by comparing it with the `expected_response`.
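The exact logic behind the custom `answer_correctness` metric is not shown here; as an illustration only, a comparison of this kind is often implemented as a token-overlap F1 between the response and the expected response. The sketch below is hypothetical and not the tool's actual implementation:

```python
import re


def answer_correctness(response: str, expected_response: str) -> float:
    """Hypothetical stand-in for the custom metric: token-overlap F1
    between the model's response and the expected response.
    Returns a score in [0.0, 1.0]."""
    def tokenize(text: str) -> set[str]:
        return set(re.findall(r"\w+", text.lower()))

    resp, exp = tokenize(response), tokenize(expected_response)
    if not resp or not exp:
        return 0.0
    overlap = len(resp & exp)
    if overlap == 0:
        return 0.0
    precision = overlap / len(resp)
    recall = overlap / len(exp)
    return 2 * precision * recall / (precision + recall)
```

A score of 1.0 means the two answers share the same vocabulary; partial overlap yields an intermediate score.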
Each model directory above contains the standard output files generated by the lightspeed-evaluation tool; use this guide to interpret the data.
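As a starting point for working with the per-model CSVs, the snippet below averages a numeric metric column using only the standard library. The column name `score` is an assumption; adjust it to match the actual headers in the lightspeed-evaluation output:

```python
import csv
from statistics import mean


def average_scores(csv_path: str, score_column: str = "score") -> float:
    """Average a numeric metric column from a per-model results CSV.
    NOTE: the default column name 'score' is hypothetical; replace it
    with the real header from the tool's CSV output."""
    with open(csv_path, newline="") as f:
        return mean(float(row[score_column]) for row in csv.DictReader(f))
```

The same pattern can be repeated per metric column to rebuild the summary graphs from raw data.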