Closed

33 commits
4911a43
dnandakumar-nv Jan 30, 2026
a952bf4
Add input/output logging for processor and frontend.
dnandakumar-nv Feb 3, 2026
9a2e30b
Add design doc for prompt optimizer object store integration
dnandakumar-nv Feb 4, 2026
30f0fdf
feat(object-store): add LocalFileObjectStore skeleton
dnandakumar-nv Feb 4, 2026
4bf7519
fix(object-store): address code quality issues in LocalFileObjectStore
dnandakumar-nv Feb 4, 2026
3baacd5
feat(object-store): implement LocalFileObjectStore.put_object
dnandakumar-nv Feb 4, 2026
6cbd6da
feat(object-store): implement LocalFileObjectStore.upsert_object
dnandakumar-nv Feb 4, 2026
4f00975
feat(object-store): implement LocalFileObjectStore.get_object
dnandakumar-nv Feb 4, 2026
96ebec4
feat(object-store): implement LocalFileObjectStore.delete_object
dnandakumar-nv Feb 4, 2026
94c8c23
feat(object-store): add LocalFileObjectStore config and registration
dnandakumar-nv Feb 4, 2026
a1f52fc
feat(optimizer): add PromptStorage protocol
dnandakumar-nv Feb 4, 2026
9a78df6
feat(optimizer): implement LocalFilePromptStorage
dnandakumar-nv Feb 4, 2026
2817050
feat(optimizer): implement ObjectStorePromptStorage
dnandakumar-nv Feb 4, 2026
ccec9be
feat(optimizer): add ObjectStoreSettings to OptimizerConfig
dnandakumar-nv Feb 4, 2026
4ba8658
feat(optimizer): integrate PromptStorage into prompt optimizer
dnandakumar-nv Feb 4, 2026
5890af8
Merge branch 'optimizer/file-store' into prompt-optimizer-object-store
dnandakumar-nv Feb 4, 2026
9010d05
feat(optimizer): add fitness and evaluator scores to checkpoint metadata
dnandakumar-nv Feb 4, 2026
b30f7a7
docs(optimizer): add object store integration documentation
dnandakumar-nv Feb 4, 2026
ba71088
docs: add prompt optimizer object store implementer's guide
dnandakumar-nv Feb 4, 2026
ddc6a73
Refactor routing logic and introduce new routing model
dnandakumar-nv Feb 4, 2026
116f953
Refactor routing logic and introduce new routing model
dnandakumar-nv Feb 4, 2026
1842673
Merge remote-tracking branch 'upstream/develop' into optimizer/file-s…
dnandakumar-nv Feb 4, 2026
33821ec
Refactor routing logic and introduce new routing model
dnandakumar-nv Feb 4, 2026
f33c1cb
refactor(optimizer): flatten evaluator_scores in checkpoint metadata
dnandakumar-nv Feb 4, 2026
f30160a
Merge branch 'develop' into optimizer-file-store
dnandakumar-nv Feb 4, 2026
b8109fa
Refactor code for formatting consistency and readability
dnandakumar-nv Feb 4, 2026
a342073
Implements a `langsmith_prompt_store` ObjectStore to support using La…
mpenn Feb 4, 2026
9d973ce
Merge remote-tracking branch 'dhruv-fork/optimizer-file-store' into m…
mpenn Feb 4, 2026
c858cae
Refactor prompt storage and integrate ChatPromptTemplate.
dnandakumar-nv Feb 5, 2026
db89c56
Refactor prompt storage and integrate ChatPromptTemplate.
dnandakumar-nv Feb 5, 2026
43e95c6
Add metadata support when saving final prompts and sanitize tags
dnandakumar-nv Feb 5, 2026
3b61b4a
Update test object store configuration in optimizer test file
dnandakumar-nv Feb 5, 2026
ea0f8f6
Modifications to `langsmith_prompt_store` to handle prompt tag scope …
mpenn Feb 5, 2026
126 changes: 125 additions & 1 deletion docs/source/improve-workflows/optimizer.md
@@ -326,6 +326,7 @@ optimizer:
This is the main configuration object for the optimizer.

- `output_path: Path | None`: The directory where optimization results will be saved, for example, `optimizer_results/`. Defaults to `None`.
- `object_store: ObjectStoreSettings | None`: Optional object store configuration for saving prompt checkpoints. When specified, checkpoints are stored in the configured object store instead of the local filesystem. This enables centralized storage, cloud backup, and integration with external storage systems. See [Object Store Integration](#object-store-integration) for details. Defaults to `None`.
- `eval_metrics: dict[str, OptimizerMetric] | None`: A dictionary of evaluation metrics to optimize. The keys are custom names for the metrics, and the values are `OptimizerMetric` objects.
- `numeric.enabled: bool`: Enable numeric optimization (Optuna). Defaults to `true`.
- `numeric.n_trials: int`: Number of numeric trials. Defaults to `20`.
@@ -517,12 +518,135 @@ optimizer:

During GA prompt optimization, the optimizer saves:

- `optimized_prompts_gen<N>.json`: Best prompt set after each generation.
- `optimized_prompts_gen<N>.json`: Best prompt set after each generation with performance metadata.
- `optimized_prompts.json`: Final best prompt set after all generations.
- `ga_history_prompts.csv`: Per-individual fitness and metric history across generations.

Numeric optimization outputs (Optuna) remain unchanged and can be used alongside GA outputs.

#### Checkpoint Metadata Format

Each checkpoint includes rich metadata to track optimization progress:

**When using local filesystem storage** (default), checkpoints are saved with embedded metadata:
```json
{
  "metadata": {
    "generation": 1,
    "fitness_score": 0.8542,
    "accuracy": 0.85,
    "token_efficiency": 492.4,
    "llm_latency": 3.58
  },
  "prompts": {
    "functions.my_tool.prompt": [
      "optimized prompt text",
      "purpose description"
    ]
  }
}
```

**When using object store storage**, metadata is stored in sidecar `.meta` files (for backends like LocalFileObjectStore) or in the object store's native metadata system (for backends like S3, Redis). This provides full traceability of how fitness and individual evaluator scores evolve across generations.
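
As an illustration, the sidecar convention could look like the following. This is a minimal, hypothetical sketch (the helper name and exact layout are not shown in this diff; backends with native metadata such as S3 or Redis would attach the metadata to the object itself instead of writing a sidecar file):

```python
import json
from pathlib import Path


def write_checkpoint_with_sidecar(base: Path, key: str, prompts: dict, metadata: dict) -> None:
    """Write a checkpoint plus a sidecar `.meta` file carrying its metadata."""
    target = base / key
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(json.dumps({"prompts": prompts}, indent=2))
    # e.g. optimized_prompts_gen1.json.meta sits next to the checkpoint itself
    target.with_name(target.name + ".meta").write_text(json.dumps(metadata, indent=2))
```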

## Object Store Integration

The prompt optimizer supports storing checkpoints in object stores, enabling centralized storage, cloud backup, and integration with external storage systems. This is particularly useful for:

- **Distributed Teams**: Share optimization results via cloud storage (S3, Azure Blob, GCS)
- **Long-Running Optimizations**: Persist checkpoints for resume capability
- **Centralized Storage**: Store checkpoints in databases (Redis, MySQL) for analytics
- **Backup and Recovery**: Automatic backup of optimization progress

### Configuration

To enable object store storage, add an `object_store` field to your optimizer configuration:

```yaml
# Define the object store backend
object_stores:
  my_checkpoint_store:
    _type: local_file  # or s3, redis, mysql
    base_path: ./optimization_checkpoints

optimizer:
  output_path: "optimizer_results"

  # Configure object store for checkpoint storage
  object_store:
    name: my_checkpoint_store    # Reference to object store defined above
    key_prefix: my_experiment_v1 # Optional: subdirectory/prefix for organization

  prompt:
    enabled: true
    ga_population_size: 16
    ga_generations: 8

  eval_metrics:
    accuracy:
      evaluator_name: "accuracy"
      direction: "maximize"
      weight: 0.8
```

### Object Store Backends

The optimizer supports any registered object store backend. Common options:

**Local Filesystem** (`local_file`):
```yaml
object_stores:
  local_store:
    _type: local_file
    base_path: ./checkpoints
```

**Amazon S3** (`s3`):
```yaml
object_stores:
  s3_store:
    _type: nat.plugins.s3/s3
    bucket_name: my-optimization-checkpoints
    region: us-west-2
```

**Redis** (`redis`):
```yaml
object_stores:
  redis_store:
    _type: nat.plugins.redis/redis
    host: localhost
    port: 6379
    db: 0
```

### Key Prefix Organization

Use `key_prefix` to organize checkpoints by experiment, model, or dataset:

```yaml
optimizer:
  object_store:
    name: s3_store
    key_prefix: experiments/chatbot_v2/baseline  # Creates subdirectories
```

This creates keys like `experiments/chatbot_v2/baseline/optimized_prompts_gen1.json`.
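
The key composition is simple enough to sketch in a few lines. The helper below is hypothetical (not part of the toolkit) and also mirrors the documented fallback to an auto-generated, timestamp-based prefix when `key_prefix` is omitted:

```python
from datetime import datetime, timezone


def checkpoint_key(key_prefix: str | None, filename: str) -> str:
    """Compose an object-store key; fall back to a timestamped prefix."""
    prefix = key_prefix or datetime.now(timezone.utc).strftime("run_%Y%m%d_%H%M%S")
    return f"{prefix}/{filename}"


# checkpoint_key("experiments/chatbot_v2/baseline", "optimized_prompts_gen1.json")
# -> "experiments/chatbot_v2/baseline/optimized_prompts_gen1.json"
```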

### Metadata Storage

Object stores preserve full optimization metadata including:
- Generation number
- Fitness score (overall optimization objective)
- Individual evaluator scores (accuracy, latency, token efficiency, etc.)

This metadata enables:
- **Progress Tracking**: Monitor fitness improvement across generations
- **Evaluator Analysis**: Understand which metrics drive performance
- **Reproducibility**: Full record of optimization trajectory
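
As a concrete example of progress tracking, the embedded-metadata format shown earlier can be mined for a fitness trajectory. This sketch reads local filesystem checkpoints; an object-store backend would expose the same fields through its metadata mechanism:

```python
import json
from pathlib import Path


def fitness_trajectory(results_dir: Path) -> list[tuple[int, float]]:
    """Return sorted (generation, fitness_score) pairs from saved checkpoints."""
    points = []
    for path in results_dir.glob("optimized_prompts_gen*.json"):
        meta = json.loads(path.read_text())["metadata"]
        points.append((meta["generation"], meta["fitness_score"]))
    return sorted(points)
```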

For more information on object stores, see the [Object Store documentation](../build-workflows/object-store.md).

## Running the Optimizer

Once you have your optimizer configuration and optimizable fields set up, you can run the optimizer from the command line using the `nat optimize` command.
@@ -0,0 +1,136 @@
# SPDX-FileCopyrightText: Copyright (c) 2024-2026, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-License-Identifier: Apache-2.0
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

general:
  telemetry:
    logging:
      console:
        _type: console
        level: WARN
      file:
        _type: file
        path: ./.tmp/email_phishing_analyzer_test.log
        level: DEBUG

# Object store for testing
object_stores:
  test_store:
    _type: langsmith_prompt_store
    api_key: ${LANGSMITH_API_KEY}
    api_url: ${LANGSMITH_ENDPOINT}
    is_public: false
    default_tags: [my_default_tag]

functions:
  email_phishing_analyzer:
    _type: email_phishing_analyzer
    llm: phishing_llm
    optimizable_params:
      - prompt
  prompt_init:
    _type: prompt_init
    optimizer_llm: prompt_optimizer
    system_objective: Agent that triages an email to see if it is a phishing attempt or not.
  prompt_recombination:
    _type: prompt_recombiner
    optimizer_llm: prompt_optimizer
    system_objective: Agent that triages an email to see if it is a phishing attempt or not.

llms:
  phishing_llm:
    _type: nim
    model_name: meta/llama-3.1-405b-instruct
    temperature: 0.0
    max_tokens: 1024
    optimizable_params:
      - temperature
      - top_p
      - max_tokens
      - model_name
    search_space:
      model_name:
        values:
          - meta/llama-3.1-405b-instruct
          - meta/llama-3.1-70b-instruct
  prompt_optimizer:
    _type: nim
    model_name: meta/llama-3.1-70b-instruct
    temperature: 0.5
    max_tokens: 2048

workflow:
  _type: react_agent
  tool_names:
    - email_phishing_analyzer
  llm_name: prompt_optimizer
  verbose: true
  parse_agent_response_max_retries: 3

eval:
  general:
    output_dir: ./.tmp/eval/examples/evaluation_and_profiling/email_phishing_analyzer/test
    verbose: true
    dataset:
      _type: csv
      file_path: examples/evaluation_and_profiling/email_phishing_analyzer/data/smaller_test.csv
      id_key: "subject"
      structure:
        question_key: body
        answer_key: label
  evaluators:
    accuracy:
      _type: ragas
      metric: AnswerAccuracy
      llm_name: prompt_optimizer
    llm_latency:
      _type: avg_llm_latency
    token_efficiency:
      _type: avg_tokens_per_llm_end

optimizer:
  output_path: ./.tmp/examples/evaluation_and_profiling/email_phishing_analyzer/optimizer_test/
  reps_per_param_set: 1
  eval_metrics:
    accuracy:
      evaluator_name: accuracy
      direction: maximize
    token_efficiency:
      evaluator_name: token_efficiency
      direction: minimize
    latency:
      evaluator_name: llm_latency
      direction: minimize

  # Object store configuration for prompt storage
  object_store:
    name: test_store
    key_prefix: prompt_opt_integration_test

  # Disable numeric optimization for this test
  numeric:
    enabled: false

  # Enable prompt optimization with minimal settings for quick test
  prompt:
    enabled: true
    prompt_population_init_function: prompt_init
    prompt_recombination_function: prompt_recombination
    ga_generations: 2
    ga_population_size: 2
    ga_diversity_lambda: 0.3
    ga_parallel_evaluations: 1
@@ -16,6 +16,7 @@
from collections.abc import Sequence
from typing import Any
from typing import Generic
from typing import Literal
from typing import TypeVar

import numpy as np
@@ -41,7 +42,7 @@ class SearchSpace(BaseModel, Generic[T]):
    is_prompt: bool = False
    prompt: str | None = None  # prompt to optimize
    prompt_purpose: str | None = None  # purpose of the prompt

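    # New field in this PR: the chat-message role this prompt maps to, presumably
    # feeding the ChatPromptTemplate integration; "unknown" means unclassified.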
    prompt_role: Literal["system", "human", "unknown"] = "unknown"
    model_config = ConfigDict(protected_namespaces=(), extra="forbid")

    @model_validator(mode="after")
16 changes: 16 additions & 0 deletions packages/nvidia_nat_core/src/nat/data_models/optimizer.py
@@ -152,6 +152,19 @@ class PromptGAOptimizationConfig(BaseModel):
    )


class ObjectStoreSettings(BaseModel):
    """
    Settings for object store integration with optimizer.

    Attributes:
        name: Reference to object store name in config
        key_prefix: Optional prefix for storage keys. If None,
            auto-generates timestamp-based prefix.
    """
    name: str = Field(description="Object store name from config")
    key_prefix: str | None = Field(default=None, description="Optional key prefix. Auto-generated if None.")
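
    # Illustrative usage (not part of this diff):
    #   ObjectStoreSettings(name="my_checkpoint_store", key_prefix="experiments/run1")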


class OptimizerConfig(BaseModel):
"""
Parameters used by the workflow optimizer.
@@ -161,6 +174,9 @@ class OptimizerConfig(BaseModel):
        description="Path to the output directory where the results will be saved.",
    )

    object_store: ObjectStoreSettings | None = Field(
        default=None, description="Optional object store for prompt checkpoints and final prompts")

    eval_metrics: dict[str, OptimizerMetric] | None = Field(
        description="List of evaluation metrics to optimize.",
        default=None,