huggingface / lighteval Public

Pull requests: huggingface/lighteval

104 Open 669 Closed

Pull requests list

Fix flatten_dict crash/wrong key for bare numpy array values

#1247 opened May 28, 2026 by Kymi808

chore: enable Dependabot weekly GitHub Actions bumps (label: dependabot)

#1246 opened May 26, 2026 by hf-dependantbot-rollout Bot

Add ArxivRollBench tasks

#1245 opened May 24, 2026 by liangzid

feat: implement loglikelihood and loglikelihood_rolling for LiteLLMClient (closes #1093)

#1244 opened May 21, 2026 by ALI-AL-MARJANI

Fix callable type hint in parallelism helper

#1239 opened May 20, 2026 by GoparapukethaN

fix: guard choices[0] and message=None before content access in llm_as_judge

#1238 opened May 17, 2026 by qizwiz

docs: fix custom model examples

#1237 opened May 15, 2026 by MukundaKatta

fix: prevent IndexError in Doc.get_golds() for out-of-bounds gold_index

#1236 opened May 13, 2026 by AmSach

fix typo

#1235 opened May 8, 2026 by fpetrakov

fix: transpose references before passing to sacrebleu in CorpusLevelTranslationMetric

#1234 opened May 8, 2026 by jaydenC88

2 tasks done

Popotest patch 1

#1231 opened May 5, 2026 by popotest

test: style-bot trigger

#1221 opened May 4, 2026 by paulinebm Contributor

Add Bayes@N metric

#1219 opened Apr 29, 2026 by mohsenhariri

Log per-sample details as trackio.Trace in push_to_wandb

#1217 opened Apr 27, 2026 by abidlabs Member

Add LICA-Bench: graphic design VLM evaluation (39 tasks, 7 domains)

#1212 opened Apr 15, 2026 by purvanshi

3 of 4 tasks

POLLUX LLM-Judge metric

#1210 opened Apr 10, 2026 by ulyanaisaeva

catch task has no docs instead of throw

#1207 opened Apr 8, 2026 by BuiHoangTu

add multilingual flag to vllm

#1206 opened Apr 8, 2026 by BuiHoangTu

fix(vllm): Enhance VLLMModel context size handling for batch inputs

#1205 opened Apr 6, 2026 by paulovsantanas

examples: add RAIL Score responsible AI custom task template

#1203 opened Apr 2, 2026 by SumitVermakgp

Add --load-tasks-multilingual and fix --custom-tasks for inspect backend

#1199 opened Mar 25, 2026 by dzautner

4 tasks done

[Bugfix] Check all responses when n>1 instead of only the first one

#1197 opened Mar 23, 2026 by eldarkurtic Contributor

[Litellm Enhancement] Enable extra sampling args for litellm backend

#1195 opened Mar 20, 2026 by eldarkurtic Contributor

[Bugfix] presence_penalty is silently dropped from sampling args in litellm backend

#1193 opened Mar 18, 2026 by eldarkurtic Contributor

[Bugfix] litellm backend should iterate over docs in a split not entire dataset

#1192 opened Mar 18, 2026 by eldarkurtic Contributor
