
Extend ObjectStore with structured data reading for evaluation #1590

Draft
dnandakumar-nv wants to merge 13 commits into NVIDIA:develop from dnandakumar-nv:dataset-store-object-store

Conversation

@dnandakumar-nv
Contributor

@dnandakumar-nv dnandakumar-nv commented Feb 10, 2026

Description

  • Add read_dataframe() to the ObjectStore interface so evaluation datasets can be loaded through the existing ObjectStore plugin system
  • Add FileObjectStore implementation for local filesystem access
  • Add LangSmithObjectStore for loading evaluation datasets from the LangSmith API
  • Add format parsing utilities supporting CSV, JSON, JSONL, Parquet, and Excel
  • Introduce EvalDatasetConfig — a single flat configuration for evaluation datasets that supports both local files and named ObjectStore references

Motivation

Evaluation dataset loading should reuse the existing ObjectStore infrastructure rather than introducing a separate abstraction. ObjectStore already solves the "where does data come from" problem (S3, MySQL, Redis, in-memory), but it only speaks bytes. By adding a read_dataframe() method, ObjectStore gains the ability to parse structured data — which is exactly what eval needs.

This cleanly separates where data comes from (ObjectStore implementations) from what format it's in (format parsers), and makes it trivial to add new data sources (implement ObjectStore) or new formats (add a line to the parser registry).

Design

ObjectStore.read_dataframe(key, format, **kwargs) — Non-abstract method on the existing ABC. Default implementation calls get_object() to fetch bytes, infers format from the key's file extension, and parses via format parsers. All existing ObjectStore implementations (S3, MySQL, Redis, in-memory) gain this capability for free. Subclasses can override for efficiency (e.g., FileObjectStore passes file paths directly to pandas, LangSmithObjectStore calls the API without a bytes intermediary).

FileObjectStore — New built-in ObjectStore for local filesystem access. Registered as _type: file. Overrides read_dataframe() to pass file paths directly to pandas readers instead of going through bytes + BytesIO.

LangSmithObjectStore — New ObjectStore in nvidia_nat_langchain that fetches evaluation datasets from the LangSmith API. Supports dataset name/ID lookup, splits, version tags, limits, and custom input/output key mapping. Read-only (put/upsert/delete raise NotImplementedError).

EvalDatasetConfig — Flat Pydantic model for dataset configuration. Two modes:

  • file_path: Local file shorthand (creates a transient FileObjectStore internally)
  • object_store + key: Reference a named ObjectStore from the workflow config

Format can be specified three ways (in order of precedence):

  1. Explicit format: field (e.g., format: csv)
  2. _type: field (accepted as an alias for format:, e.g., _type: json)
  3. Inferred from the file extension when neither is specified

Configuration examples

Local file with _type (standard):

eval:
  general:
    dataset:
      _type: json
      file_path: /data/eval.json
      structure:
        question_key: input
        answer_key: expected_output

Local file with format inference (format omitted, inferred from .csv extension):

eval:
  general:
    dataset:
      file_path: /data/eval.csv
      structure:
        question_key: input
        answer_key: expected_output

S3 via ObjectStore:

object_stores:
  s3_data:
    _type: s3
    bucket: my-eval-datasets

eval:
  general:
    dataset:
      object_store: s3_data
      key: v2/eval.parquet

LangSmith via ObjectStore:

object_stores:
  langsmith:
    _type: langsmith

eval:
  general:
    dataset:
      object_store: langsmith
      key: my-eval-dataset

Files changed

New files:

  • nat/object_store/format_parsers.py — Format parsing utilities (CSV, JSON, JSONL, Parquet, Excel)
  • nat/object_store/file_object_store.py — FileObjectStore + FileObjectStoreConfig
  • nat/plugins/langchain/object_store/langsmith_object_store.py — LangSmithObjectStore + config
  • tests/nat/object_store/test_format_parsers.py — 12 tests
  • tests/nat/object_store/test_read_dataframe.py — 5 tests
  • tests/nat/object_store/test_file_object_store.py — 11 tests
  • tests/nat/eval/test_eval_dataset_config.py — 9 tests
  • tests/nat/eval/test_dataset_loading_integration.py — 8 integration tests
  • tests/object_store/test_langsmith_object_store.py — 16 tests (langchain package)

Modified files:

  • nat/object_store/interfaces.py — Added read_dataframe() to ObjectStore ABC
  • nat/data_models/dataset_handler.py — New EvalDatasetConfig
  • nat/data_models/evaluate.py — Updated EvalGeneralConfig.dataset field type
  • nat/eval/dataset_handler/dataset_handler.py — Rewired to load data via ObjectStore
  • nat/eval/evaluate.py — Updated caller for async dataset loading
  • nat/builder/builder.py — Cleaned up EvalBuilder ABC
  • nat/builder/eval_builder.py — Cleaned up WorkflowEvalBuilder
  • nat/plugins/langchain/register.py — Added LangSmith ObjectStore registration

By submitting this PR I confirm:

  • I am familiar with the Contributing Guidelines.
  • We require that all contributors "sign-off" on their commits. This certifies that the contribution is your original work, or you have rights to submit it under the same license, or a compatible license.
    • Any contribution which contains commits that are not Signed-Off will not be accepted.
  • When the PR is ready for review, new or existing tests cover these changes.
  • When the PR is ready for review, the documentation is up to date with these changes.

dnandakumar-nv and others added 13 commits February 9, 2026 07:14
Introduce dataset store registration system to streamline the handling of dataset configurations and loaders. This includes support for JSON, JSONL, CSV, Parquet, Excel, and custom dataset types. Updated the type registry and evaluation builder to integrate this feature, ensuring seamless interoperability and easier extension.

Signed-off-by: dnandakumar-nv <dnandakumar@nvidia.com>
Documented the process for creating and registering custom dataset stores. Updated relevant sections to highlight built-in and extendable dataset formats. This enhances flexibility for users working with various dataset types.

Signed-off-by: dnandakumar-nv <dnandakumar@nvidia.com>
Renamed all references from "dataset store" to "dataset loader" for clarity and consistency. Updates include class names, function names, documentation, and tests. This change aligns terminology with the primary purpose of loading datasets rather than storing them.

Signed-off-by: dnandakumar-nv <dnandakumar@nvidia.com>
Introduced a new LangSmith dataset loader for fetching evaluation datasets via API. Added corresponding configurations, unit tests, and registration logic to integrate it into the framework. This includes support for customization (keys, splits, limits), ensuring full compatibility and backward support for YAML configurations.

Signed-off-by: dnandakumar-nv <dnandakumar@nvidia.com>
Updated outdated links to the A2A protocol website across documentation. Improved clarity in the dataset loader section by refining descriptions and addressing minor inaccuracies.

Signed-off-by: dnandakumar-nv <dnandakumar@nvidia.com>
The `dataset_loader` test package is no longer needed and has been completely removed. This cleanup helps reduce unnecessary code and improves maintainability of the repository.

Signed-off-by: dnandakumar-nv <dnandakumar@nvidia.com>
Updated the dataset loader to handle datasets in the example data format specified in the LangSmith documentation. This ensures better compatibility and integration with LangSmith API datasets.

Signed-off-by: dnandakumar-nv <dnandakumar@nvidia.com>
Replaced specific placeholder paths with more generic and user-friendly ones to improve clarity in example commands and configurations. This makes the examples easier to follow and adapt for users.

Signed-off-by: dnandakumar-nv <dnandakumar@nvidia.com>
The dataset loader functionality and related tests have been removed as they are no longer relevant or required. Replacement mechanisms and updated configurations have been introduced for more streamlined dataset handling.

Signed-off-by: dnandakumar-nv <dnandakumar@nvidia.com>
@copy-pr-bot

copy-pr-bot bot commented Feb 10, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@coderabbitai

coderabbitai bot commented Feb 10, 2026

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

