Use this guide alongside the documentation map. If you prefer hand-written Python scenarios, start with Writing Scenarios; this document focuses on YAML/JSON-driven workflows.
The No-Code Scenario Builder allows you to create AgentUnit test scenarios using simple YAML or JSON configuration files instead of writing Python code.
Create a file `my_scenario.yaml`:

```yaml
name: My First Test

adapter:
  type: openai
  config:
    model: gpt-3.5-turbo
    temperature: 0.7

dataset:
  cases:
    - input: "What is 2+2?"
      expected: "4"
    - input: "What is the capital of France?"
      expected: "Paris"

metrics:
  - correctness
  - latency

timeout: 30
```

Then use it:
```python
from agentunit.nocode import ScenarioBuilder

builder = ScenarioBuilder()

# Load scenario from YAML
scenario = builder.from_yaml("my_scenario.yaml")

# Or generate Python code
code = builder.to_python("my_scenario.yaml")
print(code)
```

The configuration file supports the following fields:

- name: Scenario name (string)
- adapter: Agent adapter configuration
  - type: One of `openai`, `langgraph`, `crewai`, `autogen`, `swarm`, `phidata`, `promptflow`, `custom`
  - config: Adapter-specific configuration (object)
- dataset: Test cases
  - cases: Array of test cases (for inline datasets), OR
  - source: One of `file`, `generator`, or `huggingface`
  - path: Path to dataset file or HuggingFace dataset ID
- metrics: Array of metric names or configurations
- timeout: Maximum execution time in seconds
- retries: Number of retry attempts
- tags: Array of tags for organization
- description: Human-readable description
Inline test cases can include optional context and metadata:

```yaml
dataset:
  cases:
    - input: "Question 1"
      expected: "Answer 1"
      context: "Additional context"
      metadata:
        difficulty: "easy"
    - input: "Question 2"
      expected: "Answer 2"
```

Load cases from a file:

```yaml
dataset:
  source: file
  path: "./data/test_cases.json"
```

Load cases from a HuggingFace dataset:

```yaml
dataset:
  source: huggingface
  path: "squad"
  split: "validation"
  limit: 100
```

Generate cases with an LLM:

```yaml
dataset:
  source: generator
  generator:
    type: llm
    config:
      num_cases: 10
      prompt: "Generate Q&A pairs about Python"
```

Common metric names you can use:
- `correctness`
- `latency`
- `faithfulness`
- `answer_relevancy`
- `context_recall`
- `context_precision`
- `exact_match`
- `f1_score`
- `pii_detection`
- `data_leakage`
Custom metrics:
```yaml
metrics:
  - name: custom_metric
    threshold: 0.8
    weight: 1.0
```

Adapter configuration examples:

OpenAI:

```yaml
adapter:
  type: openai
  config:
    model: gpt-4
    temperature: 0.7
    max_tokens: 500
```

LangGraph:

```yaml
adapter:
  type: langgraph
  path: "./my_graph.py"
  config:
    model: gpt-4
```

CrewAI:

```yaml
adapter:
  type: crewai
  config:
    model: gpt-4
    max_turns: 5
```

Custom adapter:

```yaml
adapter:
  type: custom
  path: "./my_adapter.py"
  config:
    # Your custom configuration
```

Pre-built templates for common scenarios:
```python
from agentunit.nocode import TemplateLibrary

library = TemplateLibrary()

# List available templates
templates = library.list_templates()
for template in templates:
    print(f"{template.name}: {template.description}")

# Get a specific template
template = library.get_template("basic_qa")

# Apply template with customizations
config = library.apply_template(
    "basic_qa",
    name="My Custom Q&A Test",
    adapter={
        "config": {
            "model": "gpt-4",
            "temperature": 0.0,
        }
    },
    timeout=60,
)

# Save customized config
import yaml

with open("my_test.yaml", "w") as f:
    yaml.dump(config, f)
```

Available templates:

- basic_qa: Basic question-answering scenario
- rag_evaluation: Retrieval-Augmented Generation evaluation
- agent_interaction: Multi-turn agent conversation testing
- benchmark_test: Standardized benchmark evaluation
- cost_optimization: Cost-optimized scenario with fallback models
- privacy_test: Privacy and PII detection testing
Convert between YAML, JSON, and Python:
```python
from agentunit.nocode import ConfigConverter, ConversionFormat

converter = ConfigConverter()

# YAML to JSON
converter.convert("scenario.yaml", ConversionFormat.JSON, "scenario.json")

# JSON to Python code
code = converter.convert_to_python("scenario.json")
print(code)

# YAML to Python
python_file = converter.convert("scenario.yaml", ConversionFormat.PYTHON, "scenario.py")
```

Validate configurations before use:
```python
from agentunit.nocode import SchemaValidator

validator = SchemaValidator()

# Validate file
result = validator.validate_file("my_scenario.yaml")

if not result.valid:
    print("Validation errors:")
    for error in result.errors:
        print(f"  {error.path}: {error.message}")
else:
    print("✓ Configuration is valid")

# Check warnings
if result.warnings:
    print("Warnings:")
    for warning in result.warnings:
        print(f"  {warning}")
```

Load all scenarios from a directory:
```python
from agentunit.nocode import ScenarioBuilder

builder = ScenarioBuilder()

# Load all YAML files from directory
scenarios = builder.from_directory(
    "scenarios/",
    pattern="*.yaml"
)

print(f"Loaded {len(scenarios)} scenarios")
```

A complete example:

```yaml
name: Customer Support Agent Test
description: Testing a customer support chatbot

tags:
  - support
  - chatbot
  - production

adapter:
  type: openai
  config:
    model: gpt-4
    temperature: 0.7
    max_tokens: 500
    timeout: 30

dataset:
  cases:
    - input: "How do I reset my password?"
      expected: "Click on 'Forgot Password' link on the login page"
      metadata:
        category: "account"
        priority: "high"
    - input: "What are your business hours?"
      expected: "We're open Monday-Friday, 9 AM - 5 PM EST"
      metadata:
        category: "info"
        priority: "low"
    - input: "My account is locked"
      expected: "I'll help you unlock it. Please verify your email"
      metadata:
        category: "account"
        priority: "high"

metrics:
  - correctness
  - latency
  - helpfulness

retries: 2
timeout: 60
```

Run it:

```python
from agentunit.nocode import ScenarioBuilder
from agentunit import run_suite

# Load scenario
builder = ScenarioBuilder()
scenario = builder.from_yaml("scenario.yaml")

# Run tests
results = run_suite([scenario])

# Print results
print(f"Success rate: {results.success_rate:.1%}")
```

Best practices:

- Use Templates: Start with a template and customize it
- Validate Early: Always validate configs before running (see the sketch after this list)
- Organize by Directory: Group related scenarios in folders
- Version Control: Keep scenario configs in git
- Use Metadata: Add metadata to track test categories/priorities
- Set Timeouts: Always specify reasonable timeouts
- Test Incrementally: Start with small datasets, then scale up
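
Several of these practices can be combined in one small driver script: validate every config in a directory, then load and run only the ones that pass. Below is a minimal sketch that reuses the `SchemaValidator`, `ScenarioBuilder`, and `run_suite` APIs shown above; the `scenarios/` directory layout and the overall flow are illustrative assumptions, not a prescribed workflow.

```python
from pathlib import Path

from agentunit import run_suite
from agentunit.nocode import ScenarioBuilder, SchemaValidator

validator = SchemaValidator()
builder = ScenarioBuilder()

# Validate every YAML config before loading anything (hypothetical scenarios/ layout)
valid_paths = []
for path in sorted(Path("scenarios/").glob("*.yaml")):
    result = validator.validate_file(str(path))
    if result.valid:
        valid_paths.append(path)
    else:
        print(f"Skipping {path}: {len(result.errors)} validation error(s)")

# Load and run only the configs that passed validation
scenarios = [builder.from_yaml(str(path)) for path in valid_paths]
results = run_suite(scenarios)
print(f"Success rate: {results.success_rate:.1%}")
```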
If you get validation errors, check the following (a minimal passing config is sketched after this list):
- Required fields are present (name, adapter, dataset)
- Adapter type is one of the supported types
- Dataset has either a `cases` or `source` field
- Metric names are valid or properly structured
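
For reference, here is a config sketch that addresses each item on that checklist, built and validated from Python. The field values are placeholders, the structure (a name, a supported adapter type, a dataset with cases) is the point, and it assumes the `SchemaValidator` API shown earlier; the actual schema may require additional fields.

```python
import yaml

from agentunit.nocode import SchemaValidator

# Smallest structure the checklist above asks for: name, adapter, dataset
minimal_config = {
    "name": "Minimal valid scenario",             # required
    "adapter": {
        "type": "openai",                         # must be a supported adapter type
        "config": {"model": "gpt-3.5-turbo"},
    },
    "dataset": {
        "cases": [                                # either `cases` or `source` is required
            {"input": "What is 2+2?", "expected": "4"},
        ],
    },
    "metrics": ["correctness"],                   # valid metric name
}

with open("minimal_scenario.yaml", "w") as f:
    yaml.dump(minimal_config, f)

result = SchemaValidator().validate_file("minimal_scenario.yaml")
print("valid" if result.valid else result.errors)
```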
If you get import errors:
- Ensure jsonschema is installed: `pip install jsonschema`
- Ensure PyYAML is installed: `pip install pyyaml`
If adapter creation fails:
- Provide adapter instance directly to builder
- Or use code generation and implement the adapter separately (see the sketch after this list)
- Check adapter type spelling
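
For the code-generation route, a minimal sketch that uses the `to_python` method shown earlier to write out a starting script you can edit by hand; the output filename is an arbitrary choice.

```python
from agentunit.nocode import ScenarioBuilder

builder = ScenarioBuilder()

# Generate Python code from the config instead of constructing the adapter directly
code = builder.to_python("my_scenario.yaml")

# Write it out as a starting point, then wire in your adapter by hand
with open("my_scenario_generated.py", "w") as f:
    f.write(code)
```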
- Browse available templates: `TemplateLibrary().list_templates()`
- Create your first scenario from a template
- Customize it for your use case
- Validate the configuration
- Run the scenario and iterate
For more information, see the full API documentation.