Hi, thanks for releasing LocAgent — great work!
A few quick questions:
- Could you clarify how to use the fields in `Loc-Bench_V1` when developing or evaluating new agents? A short description of each field and the label format would help a lot.
- For evaluation, is there a canonical JSONL format for the results? The `evaluation/run_evaluation.ipynb` notebook is helpful, but a concrete example or format spec would be great, especially for batch evaluation runs.
- Any plans for a leaderboard or a standard evaluation protocol for Loc-Bench?
Just want to make sure we're aligned with how the benchmark is intended to be used. Appreciate it!