rubric-based-evaluation

Here are 3 public repositories matching this topic...

PabloCabaleiro / pondera

Pondera is a lightweight, YAML-first framework to evaluate AI models and agents with pluggable runners and an LLM-as-a-judge.

python ai agents model-agnostic ai-evaluation llms llm-evaluation llm-evaluation-framework llm-judge agent-evaluation ai-evaluation-framework rubric-based-evaluation yaml-first

Updated Oct 23, 2025
Python

abhinavag-svg / ai-coding-sessionprompt-analyzer

Analyze Claude Code session logs and generate efficiency reports, cost diagnostics, and actionable recommendations. This project reads local JSONL session logs, computes deterministic efficiency signals, and can optionally add local LLM recommendations using Ollama.

python3 analyzer efficiency-analysis ai-code-review ollama claude-code rubric-based-evaluation composite-scoring

Updated Mar 12, 2026
Python

renataennes / llm-annotation-testset

Bilingual LLM annotation dataset — EN/PT quality evaluation

python nlp annotation portuguese data-annotation bilingual cohen-kappa llm-evaluation rubric-based-evaluation

Updated Apr 13, 2026
Jupyter Notebook
