Optimizer file store by dnandakumar-nv · Pull Request #1560 · NVIDIA/NeMo-Agent-Toolkit

dnandakumar-nv · 2026-02-04T17:52:29Z

Description

Closes

By Submitting this PR I confirm:

I am familiar with the Contributing Guidelines.
We require that all contributors "sign-off" on their commits. This certifies that the contribution is your original work, or you have rights to submit it under the same license, or a compatible license.
- Any contribution which contains commits that are not Signed-Off will not be accepted.
When the PR is ready for review, new or existing tests cover these changes.
When the PR is ready for review, the documentation is up to date with these changes.

Refactor processor and router logic to simplify and streamline. Replaced prefix-based routing with session-based routing and introduced latency priority. Removed KV cache-specific logic, unused fields, and redundant code. Adjusted worker partitioning for simplified load management and priority handling. Signed-off-by: dnandakumar-nv <dnandakumar@nvidia.com>

Introduced asynchronous logging for both input and output data in the processor and frontend modules. Logs are written to environment-specified files and the terminal, ensuring concurrency safety. This change enables better traceability of requests and responses for debugging and monitoring purposes. Signed-off-by: dnandakumar-nv <dnandakumar@nvidia.com>

This design introduces a storage abstraction layer for the prompt optimizer, enabling pluggable storage backends (local files, S3, etc.) while maintaining backward compatibility. Key additions include:
- PromptStorage protocol with semantic save/load operations
- LocalFileObjectStore implementation with sidecar metadata
- ObjectStorePromptStorage and LocalFilePromptStorage implementations
- Configuration changes to support optional object store usage

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Signed-off-by: dnandakumar-nv <dnandakumar@nvidia.com>

- Create LocalFileObjectStore class implementing ObjectStore interface
- Add test for basic instantiation
- Stub out required methods (put, upsert, get, delete)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

@override

- Add full Apache 2.0 license header to test file
- Remove unused pytest import and fix import ordering
- Add @override decorators to all method implementations
- Import exception classes proactively for future use
- Run ruff fix to ensure linting compliance

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Signed-off-by: dnandakumar-nv <dnandakumar@nvidia.com>

- Write data to {base_path}/{key}
- Write metadata to {base_path}/{key}.meta as JSON
- Raise KeyAlreadyExistsError if key exists
- Create parent directories as needed

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
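The put behavior listed above can be sketched as a standalone function. This is a minimal illustration, not the PR's implementation: the function signature is an assumption, the real method is likely an async method on LocalFileObjectStore, and `KeyAlreadyExistsError` is redefined here only so the snippet is self-contained.

```python
import json
from pathlib import Path
from typing import Optional


class KeyAlreadyExistsError(Exception):
    """Stand-in for the toolkit exception of the same name."""


def put_object(base_path: Path, key: str, data: bytes,
               metadata: Optional[dict] = None) -> None:
    """Write data to {base_path}/{key} and metadata to {base_path}/{key}.meta."""
    data_path = base_path / key
    if data_path.exists():
        raise KeyAlreadyExistsError(key)
    # Nested keys such as "foo/bar/baz.json" need their parent directories.
    data_path.parent.mkdir(parents=True, exist_ok=True)
    data_path.write_bytes(data)
    # Sidecar file: same name as the data file with ".meta" appended.
    meta_path = data_path.with_name(data_path.name + ".meta")
    meta_path.write_text(json.dumps(metadata or {}), encoding="utf-8")
```

The existence check followed by a write is not atomic. If concurrent writers are a concern, opening the data file in exclusive-create mode (`open(path, "xb")`) is one way to close that gap.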

- Create new objects or overwrite existing ones
- Write both data and metadata files
- Create parent directories as needed

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

- Read data from {base_path}/{key}
- Read metadata from {base_path}/{key}.meta if exists
- Return ObjectStoreItem with all fields
- Raise NoSuchKeyError if data file missing
- Handle missing metadata gracefully (return None)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

- Delete data file at {base_path}/{key}
- Delete metadata file at {base_path}/{key}.meta if exists
- Raise NoSuchKeyError if data file doesn't exist
- Add test for nested key paths (foo/bar/baz.json)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
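The read and delete rules from the two commit messages above might look roughly like this. It is a sketch under assumptions: `Item` is a simplified stand-in for ObjectStoreItem (its real field names are not shown in this PR page), `NoSuchKeyError` is redefined locally, and the functions are synchronous for brevity.

```python
import json
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


class NoSuchKeyError(Exception):
    """Stand-in for the toolkit exception of the same name."""


@dataclass
class Item:
    """Simplified stand-in for ObjectStoreItem; field names are assumptions."""
    data: bytes
    metadata: Optional[dict] = None


def get_object(base_path: Path, key: str) -> Item:
    data_path = base_path / key
    if not data_path.is_file():
        raise NoSuchKeyError(key)
    meta_path = data_path.with_name(data_path.name + ".meta")
    # A missing sidecar is not an error: metadata is simply None.
    metadata = json.loads(meta_path.read_text(encoding="utf-8")) if meta_path.is_file() else None
    return Item(data=data_path.read_bytes(), metadata=metadata)


def delete_object(base_path: Path, key: str) -> None:
    data_path = base_path / key
    if not data_path.is_file():
        raise NoSuchKeyError(key)
    data_path.unlink()
    # Remove the sidecar only if it exists.
    data_path.with_name(data_path.name + ".meta").unlink(missing_ok=True)
```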

- Add LocalFileObjectStoreConfig with base_path field
- Register factory function for config-based instantiation
- Add test for creating from config via WorkflowBuilder
- LocalFileObjectStore is now fully functional and registered

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

- Define protocol with save/load methods for checkpoints and final prompts
- Use domain-specific operations (generation-based) vs generic key-value
- Design supports future resume/restart functionality

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
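To illustrate what a generation-based protocol (as opposed to a generic key-value interface) could look like, here is a hedged sketch. Only the names `PromptStorage`, `save_checkpoint`, and `save_final` appear in this PR page; the `load_*` method names, all signatures, and the synchronous style are assumptions. `InMemoryPromptStorage` is a toy backend invented for the example.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class PromptStorage(Protocol):
    """Assumed shape of the protocol; signatures are illustrative only."""

    def save_checkpoint(self, generation: int, prompts: dict) -> None: ...

    def load_checkpoint(self, generation: int) -> dict: ...

    def save_final(self, prompts: dict) -> None: ...

    def load_final(self) -> dict: ...


class InMemoryPromptStorage:
    """Toy backend: satisfies the protocol structurally, without inheritance."""

    def __init__(self) -> None:
        self._checkpoints: dict = {}
        self._final: dict = {}

    def save_checkpoint(self, generation: int, prompts: dict) -> None:
        self._checkpoints[generation] = dict(prompts)

    def load_checkpoint(self, generation: int) -> dict:
        return self._checkpoints[generation]

    def save_final(self, prompts: dict) -> None:
        self._final = dict(prompts)

    def load_final(self) -> dict:
        return self._final
```

Because callers speak in terms of generations and final prompts, a backend is free to choose its own key layout, which is what lets the same optimizer code target local files or an object store.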

- Direct filesystem writes (backward compatible behavior)
- Support optional key_prefix for subdirectory creation
- Save/load checkpoints by generation number
- Save/load final optimized prompts
- Find latest checkpoint for future resume support

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
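Finding the latest checkpoint by generation number could be done along these lines. The file naming scheme (`checkpoint_gen<N>.json`) is hypothetical, since the PR page does not show the actual names; the point of the sketch is that generations should be compared numerically rather than as strings.

```python
import re
from pathlib import Path
from typing import Optional, Tuple

# Hypothetical naming scheme; the real file names are not shown in this PR page.
_CHECKPOINT_RE = re.compile(r"^checkpoint_gen(\d+)\.json$")


def latest_checkpoint(directory: Path) -> Optional[Tuple[int, Path]]:
    """Return (generation, path) for the highest generation, or None if empty."""
    best: Optional[Tuple[int, Path]] = None
    for path in directory.glob("checkpoint_gen*.json"):
        match = _CHECKPOINT_RE.match(path.name)
        if match is None:
            continue  # skip files that only look similar
        generation = int(match.group(1))  # numeric compare: 10 > 9
        if best is None or generation > best[0]:
            best = (generation, path)
    return best
```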

- Store prompts via ObjectStore interface (supports S3, etc.)
- Auto-generate timestamp prefix if not provided
- Serialize prompts as JSON with application/json content type
- Include metadata (generation number, type)
- Stub load_latest_checkpoint for future implementation

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
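A rough sketch of how a checkpoint might be turned into an object store key and payload follows. The key layout, the timestamp format, the function name `build_checkpoint_item`, and the dictionary standing in for ObjectStoreItem are all assumptions made for illustration.

```python
import json
from datetime import datetime, timezone
from typing import Optional, Tuple


def build_checkpoint_item(prompts: dict, generation: int,
                          key_prefix: Optional[str] = None,
                          now: Optional[datetime] = None) -> Tuple[str, dict]:
    """Return (key, item) for one generation checkpoint."""
    if key_prefix is None:
        # No prefix configured: derive one from the current UTC time.
        now = now or datetime.now(timezone.utc)
        key_prefix = now.strftime("%Y%m%d_%H%M%S")
    key = f"{key_prefix}/checkpoint_gen{generation}.json"  # hypothetical layout
    item = {
        "data": json.dumps(prompts).encode("utf-8"),
        "content_type": "application/json",
        # Metadata values are strings, as described elsewhere in this PR.
        "metadata": {"generation": str(generation), "type": "checkpoint"},
    }
    return key, item
```

Accepting `now` as a parameter keeps the function deterministic in tests while still defaulting to the wall clock in normal use.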

- Add ObjectStoreSettings model for object store configuration
- Add optional object_store field to OptimizerConfig
- Support name reference and optional key_prefix
- Add comprehensive tests for config validation

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

- Add storage backend selection (LocalFile vs ObjectStore)
- Replace direct file writes with storage.save_checkpoint()
- Replace final prompt writes with storage.save_final()
- Maintain backward compatibility (defaults to LocalFilePromptStorage)
- Add error handling for storage failures
- Remove unused json import

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
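The backend selection described above, with the local-file default that preserves backward compatibility, can be sketched as follows. The class and field names `ObjectStoreSettings`, `OptimizerConfig`, `object_store`, `name`, `key_prefix`, `LocalFilePromptStorage`, and `ObjectStorePromptStorage` come from the commit messages; everything else (the `output_path` field, constructor arguments, the `select_storage` helper, plain dataclasses instead of the project's config models) is assumed for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ObjectStoreSettings:
    name: str                          # reference to a configured object store
    key_prefix: Optional[str] = None   # optional grouping prefix


@dataclass
class OptimizerConfig:
    output_path: str = "./optimizer_output"             # hypothetical field
    object_store: Optional[ObjectStoreSettings] = None  # None keeps old behavior


class LocalFilePromptStorage:
    def __init__(self, output_path: str) -> None:
        self.output_path = output_path


class ObjectStorePromptStorage:
    def __init__(self, store: object, key_prefix: Optional[str] = None) -> None:
        self.store = store
        self.key_prefix = key_prefix


def select_storage(config: OptimizerConfig, object_stores: dict):
    """Pick the storage backend; default to local files when unset."""
    if config.object_store is None:
        return LocalFilePromptStorage(config.output_path)
    store = object_stores[config.object_store.name]  # KeyError if not registered
    return ObjectStorePromptStorage(store, config.object_store.key_prefix)
```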

# Conflicts:
# packages/nvidia_nat_core/src/nat/profiler/parameter_optimization/prompt_storage.py
# packages/nvidia_nat_core/tests/nat/data_models/test_optimizer.py

This commit enhances the prompt optimizer checkpoint system to include rich performance metadata alongside optimized prompts.

Key Changes:
- Extended PromptStorage Protocol to accept fitness_score and evaluator_scores parameters in save_checkpoint()
- Updated LocalFilePromptStorage to store metadata in embedded JSON structure with backward-compatible loading
- Updated ObjectStorePromptStorage to store metadata in ObjectStoreItem metadata field
- Modified prompt_optimizer.py to pass best individual's scalar_fitness and metrics when saving checkpoints
- Added integration test configuration for testing metadata storage
- Updated optimizer documentation with object store integration guide

Benefits:
- Full traceability of optimization progress across generations
- Individual evaluator scores (accuracy, latency, etc.) tracked per checkpoint
- Enables post-optimization analysis of which metrics drove performance
- Metadata format supports both local filesystem and object store backends

Testing:
- Integration test confirms metadata is correctly stored for both generation checkpoints and final prompts
- Backward compatibility maintained for loading old checkpoint format

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Signed-off-by: dnandakumar-nv <dnandakumar@nvidia.com>
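One way to embed metadata in a local checkpoint file while still loading older files is sketched below. The envelope keys (`prompts`, `metadata`) and the assumption that the old format was a bare prompt mapping are guesses made for illustration; the PR page does not show the actual on-disk layout.

```python
import json
from typing import Optional, Tuple


def dump_checkpoint(prompts: dict, generation: int,
                    fitness_score: Optional[float] = None,
                    evaluator_scores: Optional[dict] = None) -> str:
    """Serialize prompts together with embedded performance metadata."""
    return json.dumps({
        "prompts": prompts,
        "metadata": {
            "generation": generation,
            "fitness_score": fitness_score,
            "evaluator_scores": evaluator_scores or {},
        },
    })


def load_checkpoint(text: str) -> Tuple[dict, dict]:
    """Return (prompts, metadata); old files yield empty metadata."""
    doc = json.loads(text)
    if isinstance(doc, dict) and "prompts" in doc and "metadata" in doc:
        return doc["prompts"], doc["metadata"]
    # Assumed old format: the file is just the prompt mapping.
    return doc, {}
```

Detecting the format by key presence is ambiguous if an old file happens to contain prompts named `prompts` and `metadata`; an explicit version field would be a more robust discriminator.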

Document the new object store integration feature for prompt optimizer checkpoints, including:
- Object store configuration in OptimizerConfig
- Checkpoint metadata format with fitness and evaluator scores
- Configuration examples for different storage backends
- Key prefix organization for experiments

This complements the optimizer object store integration added in the previous commits.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Signed-off-by: dnandakumar-nv <dnandakumar@nvidia.com>

Create comprehensive guide for engineers implementing new object store backends that integrate with the prompt optimizer. The guide covers:
- Architecture and integration points
- ObjectStore interface contract and data models
- Metadata format requirements (string-valued, JSON-in-string handling)
- Implementation patterns (native metadata, sidecar files, database)
- Testing requirements and common pitfalls
- Real-world examples and registration process

This enables external teams to build compatible object store backends without needing to understand prompt optimizer internals.

Target audience: Engineers building S3, Azure Blob, GCS, PostgreSQL, or other storage backends for optimization checkpoint persistence.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Signed-off-by: dnandakumar-nv <dnandakumar@nvidia.com>

copy-pr-bot · 2026-02-04T17:52:34Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

coderabbitai · 2026-02-04T17:52:40Z

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.


Substituted the simplified router with a workload-aware, contextually adaptive model utilizing KV overlap and contextual Thompson Sampling. Added support for more detailed prefix tracking, routing metrics, debug tracing, and latency-based decision optimization. Removed legacy session-based logic and outdated utility methods to enhance extensibility and maintainability. Signed-off-by: dnandakumar-nv <dnandakumar@nvidia.com>

Previously, evaluator scores were stored as a nested dict or JSON-encoded string in the metadata. Now they are flattened into individual metadata keys for easier access and querying.

Before (object store):

    { "generation": "1", "fitness_score": "0.8542", "evaluator_scores": "{\"accuracy\": 0.85, \"token_efficiency\": 492.4}" }

After (object store):

    { "generation": "1", "fitness_score": "0.8542", "accuracy": "0.85", "token_efficiency": "492.4", "llm_latency": "3.58" }

This makes it easier to query and analyze individual evaluator scores without parsing JSON strings, and provides a cleaner metadata structure.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
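The flattening described in this commit message can be illustrated with a short function. The function name and the collision check are additions for the example; the key names and string-valued output follow the "After" structure quoted in the message.

```python
def flatten_metadata(generation: int, fitness_score: float,
                     evaluator_scores: dict) -> dict:
    """Build string-valued metadata with one key per evaluator score."""
    metadata = {
        "generation": str(generation),
        "fitness_score": str(fitness_score),
    }
    for name, score in evaluator_scores.items():
        if name in metadata:
            # An evaluator named like a reserved key would silently overwrite it.
            raise ValueError(f"evaluator name collides with reserved key: {name}")
        metadata[name] = str(score)
    return metadata


flat = flatten_metadata(
    1, 0.8542,
    {"accuracy": 0.85, "token_efficiency": 492.4, "llm_latency": 3.58},
)
# flat == {"generation": "1", "fitness_score": "0.8542", "accuracy": "0.85",
#          "token_efficiency": "492.4", "llm_latency": "3.58"}
```

A flat layout trades namespacing for queryability: each score can be read directly, at the cost of having to guard against evaluator names that clash with the fixed keys.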

Consolidated multi-line function parameter definitions and object creation into single-line formats where appropriate to improve code readability. Adjusted indentation and formatting across test cases and core implementation files, ensuring consistency and maintaining functionality. Signed-off-by: dnandakumar-nv <dnandakumar@nvidia.com>

Revised prompt storage logic to handle prompts individually with adjusted key formatting. Added support for building `ChatPromptTemplate` in LangSmith prompt store, ensuring proper handling and transformation of prompt data during storage and retrieval. Signed-off-by: dnandakumar-nv <dnandakumar@nvidia.com>

Updated `save_final` methods to include optional fitness scores and evaluator scores as metadata. Introduced stricter tag sanitization and transformation to ensure compatibility with LangSmith commit conventions, replacing invalid characters and adjusting formatting. Updated relevant usages to adopt these changes. Signed-off-by: dnandakumar-nv <dnandakumar@nvidia.com>

Replaced the local file-based object store with a LangSmith prompt store configuration in `config_optimizer_test.yml`. The new configuration includes API key, endpoint URL, and additional settings for better integration with LangSmith. This change ensures consistency with updated testing requirements. Signed-off-by: dnandakumar-nv <dnandakumar@nvidia.com>

dnandakumar-nv and others added 19 commits January 30, 2026 09:16

feat(object-store): add LocalFileObjectStore skeleton

30f0fdf

feat(object-store): implement LocalFileObjectStore.put_object

3baacd5

feat(object-store): implement LocalFileObjectStore.upsert_object

6cbd6da

Merge branch 'optimizer/file-store' into prompt-optimizer-object-store

5890af8

dnandakumar-nv closed this Feb 4, 2026

dnandakumar-nv added 2 commits February 4, 2026 12:57

dnandakumar-nv reopened this Feb 4, 2026

dnandakumar-nv and others added 4 commits February 4, 2026 12:59

Merge remote-tracking branch 'upstream/develop' into optimizer/file-s…

1842673

…tore

Merge branch 'develop' into optimizer-file-store

f30160a

dnandakumar-nv added the feature request New feature or request label Feb 4, 2026

dnandakumar-nv added the non-breaking Non-breaking change label Feb 4, 2026

dnandakumar-nv and others added 8 commits February 4, 2026 17:53

Implements a langsmith_prompt_store ObjectStore to support using La…

a342073

…ngSmith as a prompt management system Also includes some formatting changes from running checks Signed-off-by: Matthew Penn <mpenn@nvidia.com>

Merge remote-tracking branch 'dhruv-fork/optimizer-file-store' into m…

9d973ce

…penn_langsmith-object-store

Modifications to langsmith_prompt_store to handle prompt tag scope …

ea0f8f6

…limitations Updates to optimizer to propagate prompt role through to handle system + human prompts Signed-off-by: Matthew Penn <mpenn@nvidia.com>

dnandakumar-nv wants to merge 33 commits into NVIDIA:develop from dnandakumar-nv:optimizer-file-store

2 participants
