Retriever harness session tooling #1495
Merged
jioffe502 merged 3 commits into NVIDIA:main on Mar 6, 2026
jdye64 approved these changes on Mar 6, 2026
- stabilize `query_csv` resolution and financebench defaults
- add tags plus summary and compare session commands
- clean recall metric keys and validate jp20 recall e2e

Signed-off-by: Jacob Ioffe <jioffe@nvidia.com>

- remove cwd fallback from relative `query_csv` resolution
- simplify /raid dataset fallback to use current user only

Signed-off-by: Jacob Ioffe <jioffe@nvidia.com>

- Mock financebench `query_csv` fixture path in `Path.exists` stub
- Prevent CI-only validation failures from missing local fixture file
- Keep test focused on `dataset_dir` fallback behavior

Signed-off-by: Jacob Ioffe <jioffe@nvidia.com>
ec3ef3c to c301bbe
TLDR

This updates the `nemo_retriever` harness to make dataset/query path resolution more reliable and adds lightweight session review commands for harness runs.

It also validates the new artifact flow with focused unit coverage and a real `jp20` e2e recall run.

Description

- Stabilize resolution of `query_csv` values and `/raid/$USER` dataset fallbacks, and enable the default FinanceBench query fixture
- Add `--tag` support plus `retriever harness summary` and `retriever harness compare` commands for session inspection
- Resolve the `python -m nemo_retriever.harness` runtime warning via a small CLI refactor

Test plan

- `pytest -q nemo_retriever/tests/test_harness_parsers.py nemo_retriever/tests/test_harness_config.py nemo_retriever/tests/test_harness_run.py nemo_retriever/tests/test_harness_reporting.py nemo_retriever/tests/test_harness_recall_adapters.py nemo_retriever/tests/test_recall_core.py`
- `python -m nemo_retriever.harness nightly --dry-run --tag nightly`
- `python -m nemo_retriever.harness run --dataset jp20 --preset single_gpu --run-name jp20_integration_check_cleaned`

Checklist