
Extend ObjectStore with structured data reading for evaluation #1590

Draft
dnandakumar-nv wants to merge 13 commits into NVIDIA:develop from dnandakumar-nv:dataset-store-object-store

Conversation

@dnandakumar-nv
Contributor

@dnandakumar-nv dnandakumar-nv commented Feb 10, 2026

Description

  • Add read_dataframe() to the ObjectStore interface so evaluation datasets can be loaded through the existing ObjectStore plugin system
  • Add FileObjectStore implementation for local filesystem access
  • Add LangSmithObjectStore for loading evaluation datasets from the LangSmith API
  • Add format parsing utilities supporting CSV, JSON, JSONL, Parquet, and Excel
  • Introduce EvalDatasetConfig — a single flat configuration for evaluation datasets that supports both local files and named ObjectStore references

Motivation

Evaluation dataset loading should reuse the existing ObjectStore infrastructure rather than introducing a separate abstraction. ObjectStore already solves the "where does data come from" problem (S3, MySQL, Redis, in-memory), but it only speaks bytes. By adding a read_dataframe() method, ObjectStore gains the ability to parse structured data — which is exactly what eval needs.

This cleanly separates where data comes from (ObjectStore implementations) from what format it's in (format parsers), and makes it trivial to add new data sources (implement ObjectStore) or new formats (add a line to the parser registry).

Design

ObjectStore.read_dataframe(key, format, **kwargs) — Non-abstract method on the existing ABC. Default implementation calls get_object() to fetch bytes, infers format from the key's file extension, and parses via format parsers. All existing ObjectStore implementations (S3, MySQL, Redis, in-memory) gain this capability for free. Subclasses can override for efficiency (e.g., FileObjectStore passes file paths directly to pandas, LangSmithObjectStore calls the API without a bytes intermediary).

FileObjectStore — New built-in ObjectStore for local filesystem access. Registered as _type: file. Overrides read_dataframe() to pass file paths directly to pandas readers instead of going through bytes + BytesIO.

LangSmithObjectStore — New ObjectStore in nvidia_nat_langchain that fetches evaluation datasets from the LangSmith API. Supports dataset name/ID lookup, splits, version tags, limits, and custom input/output key mapping. Read-only (put/upsert/delete raise NotImplementedError).

EvalDatasetConfig — Flat Pydantic model for dataset configuration. Two modes:

  • file_path: Local file shorthand (creates a transient FileObjectStore internally)
  • object_store + key: Reference a named ObjectStore from the workflow config

Format can be specified three ways (in order of precedence):

  1. Explicit format: field (e.g., format: csv)
  2. _type: field (accepted as an alias for format:, e.g., _type: json)
  3. Inferred from the file extension when neither is specified

Configuration examples

Local file with _type (standard):

eval:
  general:
    dataset:
      _type: json
      file_path: /data/eval.json
      structure:
        question_key: input
        answer_key: expected_output

Local file with format inference (format omitted, inferred from .csv extension):

eval:
  general:
    dataset:
      file_path: /data/eval.csv
      structure:
        question_key: input
        answer_key: expected_output

S3 via ObjectStore:

object_stores:
  s3_data:
    _type: s3
    bucket: my-eval-datasets

eval:
  general:
    dataset:
      object_store: s3_data
      key: v2/eval.parquet

LangSmith via ObjectStore:

object_stores:
  langsmith:
    _type: langsmith

eval:
  general:
    dataset:
      object_store: langsmith
      key: my-eval-dataset

Files changed

New files:

  • nat/object_store/format_parsers.py — Format parsing utilities (CSV, JSON, JSONL, Parquet, Excel)
  • nat/object_store/file_object_store.py — FileObjectStore + FileObjectStoreConfig
  • nat/plugins/langchain/object_store/langsmith_object_store.py — LangSmithObjectStore + config
  • tests/nat/object_store/test_format_parsers.py — 12 tests
  • tests/nat/object_store/test_read_dataframe.py — 5 tests
  • tests/nat/object_store/test_file_object_store.py — 11 tests
  • tests/nat/eval/test_eval_dataset_config.py — 9 tests
  • tests/nat/eval/test_dataset_loading_integration.py — 8 integration tests
  • tests/object_store/test_langsmith_object_store.py — 16 tests (langchain package)

Modified files:

  • nat/object_store/interfaces.py — Added read_dataframe() to ObjectStore ABC
  • nat/data_models/dataset_handler.py — New EvalDatasetConfig
  • nat/data_models/evaluate.py — Updated EvalGeneralConfig.dataset field type
  • nat/eval/dataset_handler/dataset_handler.py — Rewired to load data via ObjectStore
  • nat/eval/evaluate.py — Updated caller for async dataset loading
  • nat/builder/builder.py — Cleaned up EvalBuilder ABC
  • nat/builder/eval_builder.py — Cleaned up WorkflowEvalBuilder
  • nat/plugins/langchain/register.py — Added LangSmith ObjectStore registration

By submitting this PR I confirm:

  • I am familiar with the Contributing Guidelines.
  • We require that all contributors "sign-off" on their commits. This certifies that the contribution is your original work, or you have rights to submit it under the same license, or a compatible license.
    • Any contribution which contains commits that are not Signed-Off will not be accepted.
  • When the PR is ready for review, new or existing tests cover these changes.
  • When the PR is ready for review, the documentation is up to date with these changes.

dnandakumar-nv and others added 13 commits February 9, 2026 07:14
Introduce dataset store registration system to streamline the handling of dataset configurations and loaders. This includes support for JSON, JSONL, CSV, Parquet, Excel, and custom dataset types. Updated the type registry and evaluation builder to integrate this feature, ensuring seamless interoperability and easier extension.

Signed-off-by: dnandakumar-nv <dnandakumar@nvidia.com>
Documented the process for creating and registering custom dataset stores. Updated relevant sections to highlight built-in and extendable dataset formats. This enhances flexibility for users working with various dataset types.

Signed-off-by: dnandakumar-nv <dnandakumar@nvidia.com>
Renamed all references from "dataset store" to "dataset loader" for clarity and consistency. Updates include class names, function names, documentation, and tests. This change aligns terminology with the primary purpose of loading datasets rather than storing them.

Signed-off-by: dnandakumar-nv <dnandakumar@nvidia.com>
Introduced a new LangSmith dataset loader for fetching evaluation datasets via API. Added corresponding configurations, unit tests, and registration logic to integrate it into the framework. This includes support for customization (keys, splits, limits), ensuring full compatibility and backward support for YAML configurations.

Signed-off-by: dnandakumar-nv <dnandakumar@nvidia.com>
Updated outdated links to the A2A protocol website across documentation. Improved clarity in the dataset loader section by refining descriptions and addressing minor inaccuracies.

Signed-off-by: dnandakumar-nv <dnandakumar@nvidia.com>
The `dataset_loader` test package is no longer needed and has been completely removed. This cleanup helps reduce unnecessary code and improves maintainability of the repository.

Signed-off-by: dnandakumar-nv <dnandakumar@nvidia.com>
Updated the dataset loader to handle datasets in the example data format specified in the LangSmith documentation. This ensures better compatibility and integration with LangSmith API datasets.

Signed-off-by: dnandakumar-nv <dnandakumar@nvidia.com>
Replaced specific placeholder paths with more generic and user-friendly ones to improve clarity in example commands and configurations. This makes the examples easier to follow and adapt for users.

Signed-off-by: dnandakumar-nv <dnandakumar@nvidia.com>
The dataset loader functionality and related tests have been removed as they are no longer relevant or required. Replacement mechanisms and updated configurations have been introduced for more streamlined dataset handling.

Signed-off-by: dnandakumar-nv <dnandakumar@nvidia.com>
@copy-pr-bot

copy-pr-bot bot commented Feb 10, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@coderabbitai

coderabbitai bot commented Feb 10, 2026

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

