Merged
64 changes: 25 additions & 39 deletions README.md
@@ -11,8 +11,8 @@ OpenAI-compatible mock server for testing LLM integrations.
- OpenAI API compatibility with key endpoints (`/models`, `/chat/completions`, `/responses`)
- Configurable mock responses via strategies
- Default mirror strategy (echoes input as output)
- **Tool calling support** — config-driven tool call responses when `tools` are present in the request
- **Error message simulation** — config-driven error responses triggered by specific message content
- **Tool calling support** — trigger phrase–driven tool call responses when `tools` are present in the request using `call tool '<name>' with '<json>'`
- **Error simulation** — trigger phrase–driven error responses using `raise error <json>` in the last user message
- Streaming support for both Chat Completions and Responses APIs (including `stream_options.include_usage`)

## Quick Start
@@ -101,7 +101,7 @@ When `ToolCallStrategy` is included in the `strategies` list, llmock watches the
call tool '<name>' with '<json>'
```

- `<name>` must match one of the tools declared in `tools`.
- `<name>` is used verbatim — no check against the `tools` list in the request is performed.
- `<json>` is the arguments string passed to the tool (use `'{}'` for no arguments).
- Multiple matching lines produce multiple tool calls.
- If no line matches, the strategy falls through to the next one (e.g. `MirrorStrategy`).
@@ -171,27 +171,31 @@ function_call = response.output[0]
# function_call.arguments == '{"expression": "6*7"}' (from trigger phrase)
```
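
The trigger-phrase matching described in this section can be sketched in a few lines. The regular expression is the one defined as `_TRIGGER_RE` in `strategy_tool_call.py`; the `parse_triggers` name and the plain-tuple return type are illustrative assumptions, since llmock itself returns `StrategyResponse` objects.

```python
import re

# Same pattern as _TRIGGER_RE in src/llmock/strategies/strategy_tool_call.py.
TRIGGER_RE = re.compile(r"call tool '([^']+)' with '([^']*)'")


def parse_triggers(text: str) -> list[tuple[str, str]]:
    """Return (tool_name, arguments_json) pairs, one per matching line.

    Hypothetical helper for illustration only.
    """
    calls: list[tuple[str, str]] = []
    for line in text.splitlines():
        match = TRIGGER_RE.search(line)
        if match is None:
            continue  # lines without a trigger phrase are ignored
        name, args = match.group(1), match.group(2)
        # An empty argument string is treated as an empty JSON object.
        calls.append((name, args or "{}"))
    return calls


message = """Please help.
call tool 'calculator' with '{"expression": "6*7"}'
call tool 'get_time' with ''"""
print(parse_triggers(message))
# [('calculator', '{"expression": "6*7"}'), ('get_time', '{}')]
```

Because the name is taken verbatim, a trigger for a tool that was never declared in `tools` still yields a tool call in this sketch, which mirrors the behaviour introduced by this change.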

### Error Message Simulation
### Error Simulation

Error responses are configured in `config.yaml` under the `error-messages` section. When a request's last user message matches a key in that section exactly, the server returns the configured error response instead of a normal completion.
When `ErrorStrategy` is included in the `strategies` list, llmock watches the last user message for lines matching the pattern:

Default configuration:
```
raise error <json>
```

The JSON payload must contain:
- `code` (integer) — HTTP status code to return
- `message` (string) — error message
- `type` (string, optional) — OpenAI error type (e.g. `"rate_limit_error"`)
- `error_code` (string, optional) — OpenAI error code (e.g. `"rate_limit_exceeded"`)

| Message Content | HTTP Status | Error Type | Message |
|------------------|-------------|-------------------------|------------------------|
| `trigger-401` | 401 | `authentication_error` | Invalid API key |
| `trigger-429` | 429 | `rate_limit_error` | Rate limit exceeded |
| `trigger-500` | 500 | `server_error` | Internal server error |
The first matching line wins. If no line matches, the strategy falls through to the next one.

You can add custom error triggers by extending the `error-messages` section:
No extra config keys are needed — adding `ErrorStrategy` to the `strategies` list is sufficient.
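
As a rough illustration of the rules above, the following sketch extracts the first valid `raise error <json>` payload from a message. The regular expression, the `parse_error_trigger` name, and the choice to skip lines whose payload is invalid are all assumptions; the actual `ErrorStrategy` implementation is not part of this diff.

```python
import json
import re
from typing import Any, Optional

# Assumed pattern: the real ErrorStrategy regex is not shown in this change.
ERROR_RE = re.compile(r"raise error (\{.*\})")


def parse_error_trigger(text: str) -> Optional[dict[str, Any]]:
    """Return the first payload with an integer `code` and a string `message`."""
    for line in text.splitlines():
        match = ERROR_RE.search(line)
        if match is None:
            continue
        try:
            payload = json.loads(match.group(1))
        except json.JSONDecodeError:
            continue  # assumption: malformed JSON is skipped, not an error
        code = payload.get("code")
        message = payload.get("message")
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(code, int) and not isinstance(code, bool) and isinstance(message, str):
            return payload
    return None


text = 'raise error {"code": 429, "message": "Rate limit exceeded", "type": "rate_limit_error"}'
print(parse_error_trigger(text))
```

A `None` result corresponds to the strategy falling through to the next entry in the `strategies` list.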

#### Configuration

```yaml
error-messages:
"Hi":
status-code: 403
message: "Forbidden"
type: "forbidden_error"
code: "forbidden"
strategies:
- ErrorStrategy
- ToolCallStrategy
- MirrorStrategy
```

```python
@@ -202,7 +206,7 @@ client = OpenAI(base_url="http://localhost:8000", api_key="mock-key")
try:
client.chat.completions.create(
model="gpt-4o",
messages=[{"role": "user", "content": "trigger-429"}]
messages=[{"role": "user", "content": 'raise error {"code": 429, "message": "Rate limit exceeded", "type": "rate_limit_error", "error_code": "rate_limit_exceeded"}'}]
)
except APIStatusError as e:
print(e.status_code) # 429
@@ -214,7 +218,7 @@ Works on both `/chat/completions` and `/responses` endpoints.

## Configuration

Edit `config.yaml` to configure available models, response strategies, error messages, and tool call responses:
Edit `config.yaml` to configure available models and response strategies:

```yaml
# Ordered list of strategies to try (first non-empty result wins)
@@ -231,24 +235,7 @@ models:
- id: "gpt-4o-mini"
created: 1721172741
owned_by: "openai"

# Config-driven error responses (triggered by message content)
error-messages:
"trigger-401":
status-code: 401
message: "Invalid API key"
type: "authentication_error"
code: "invalid_api_key"
"trigger-429":
status-code: 429
message: "Rate limit exceeded"
type: "rate_limit_error"
code: "rate_limit_exceeded"
"trigger-500":
status-code: 500
message: "Internal server error"
type: "server_error"
code: "internal_error"
```

### Environment Variable Overrides

@@ -268,7 +255,6 @@ export LLMOCK_CORS_ALLOW_ORIGINS="http://localhost:8000;http://localhost:5173"
Notes:
- Lists are parsed from semicolon-separated values.
- Only keys that exist in `config.yaml` are overridden.
```

## Development

12 changes: 0 additions & 12 deletions config.yaml
@@ -21,15 +21,3 @@ models:
created: 1677610602
owned_by: "openai"

# Ordered list of strategies to try. The first strategy that returns a
# non-empty result wins. At least one must be set.
#
# Available strategies:
# MirrorStrategy - echoes back the last user message
# ToolCallStrategy - triggered by `call tool '<name>' with '<json>'` phrase in last user message
# ErrorStrategy - triggered by `raise error <json>` phrase in last user message
#
strategies:
- ErrorStrategy
- ToolCallStrategy
- MirrorStrategy
6 changes: 3 additions & 3 deletions docs/ARCHITECTURE.md
@@ -67,7 +67,7 @@ ResponseStrategy {
- `ResponseMirrorStrategy`: Extract last user input → return `[StrategyResponse(type="text", content=...)]`

**Tool Call Strategies** (trigger phrase–driven):
- `ChatToolCallStrategy` (Chat Completions): Parses the **last user message** line-by-line for the pattern `call tool '<name>' with '<json>'`. Each matching line whose `<name>` appears in `request.tools` produces a `StrategyResponse(type="tool_call")` with the extracted JSON arguments. Multiple matching lines produce multiple responses. If no line matches, returns an empty list (falls through to the next strategy). No config keys required.
- `ChatToolCallStrategy` (Chat Completions): Parses the **last user message** line-by-line for the pattern `call tool '<name>' with '<json>'`. Every matching line produces a `StrategyResponse(type="tool_call")` with the extracted name and JSON arguments — no validation against `request.tools` is performed. Multiple matching lines produce multiple responses. If no line matches, returns an empty list (falls through to the next strategy). No config keys required.
- `ResponseToolCallStrategy` (Responses API): Same trigger-phrase logic but operates on `ResponseCreateRequest` inputs (string or structured message list).
- Both support streaming and non-streaming modes.

@@ -80,7 +80,7 @@ ResponseStrategy {
**Composition Strategy** (used by routers):
- `ChatCompositionStrategy` / `ResponseCompositionStrategy` — reads the `strategies` list from config, creates each sub-strategy via the factory registry, and runs them in order.
- The first strategy that returns a **non-empty** `list[StrategyResponse]` wins; remaining strategies are not called.
- Default when `strategies` is missing: `["MirrorStrategy"]`.
- Default when `strategies` is missing: `["ErrorStrategy", "ToolCallStrategy", "MirrorStrategy"]`.
- Unknown strategy names are skipped with a warning.
- Not registered in the factory — it **wraps** the factory internally.
- Both routers (`/chat/completions` and `/responses`) instantiate the composition strategy directly.
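
The "first non-empty result wins" rule can be sketched as follows. This is a simplified illustration: strategies are modelled as plain callables, and the `run_composition` name and dictionary registry are hypothetical stand-ins for llmock's factory registry. The default order is the one assigned to `_DEFAULT_STRATEGIES` in this change.

```python
import logging
from typing import Callable, Optional

logger = logging.getLogger(__name__)

# Simplification: a strategy maps a request to a list of responses.
Strategy = Callable[[str], list[str]]

# Default order taken from _DEFAULT_STRATEGIES in strategy_composition.py.
DEFAULT_STRATEGIES = ["ErrorStrategy", "ToolCallStrategy", "MirrorStrategy"]


def run_composition(
    request: str,
    registry: dict[str, Strategy],
    names: Optional[list[str]] = None,
) -> list[str]:
    """Try strategies in order and return the first non-empty result."""
    for name in names or DEFAULT_STRATEGIES:
        strategy = registry.get(name)
        if strategy is None:
            # Unknown names are skipped with a warning rather than failing.
            logger.warning("Unknown strategy %r skipped", name)
            continue
        result = strategy(request)
        if result:
            return result  # remaining strategies are not called
    return []


registry = {
    "ToolCallStrategy": lambda request: [],        # no trigger phrase found
    "MirrorStrategy": lambda request: [request],   # echoes the input
}
print(run_composition("hello", registry))  # ['hello']
```

Here `ErrorStrategy` is deliberately missing from the registry to show the skip-with-warning path; `ToolCallStrategy` returns an empty list, so the mirror result is used.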
@@ -116,7 +116,7 @@ models:

The `strategies` field is an ordered list of strategy names to try. The composition strategy runs them in order; the first one that returns a non-empty result wins. If omitted, `["MirrorStrategy"]` is the default.

`ToolCallStrategy` fires when the last user message contains a line matching `call tool '<name>' with '<json>'` and `<name>` is present in `request.tools`.
`ToolCallStrategy` fires when the last user message contains a line matching `call tool '<name>' with '<json>'`. The `<name>` is used verbatim — no check against `request.tools` is performed.

`ErrorStrategy` fires when the last user message contains a line matching `raise error <json>`, where `<json>` has at least `code` (int) and `message` (str).

4 changes: 2 additions & 2 deletions docs/llmock-skill/SKILL.md
@@ -174,10 +174,10 @@ When `ToolCallStrategy` is in the strategies list, llmock scans the last user me
call tool '<name>' with '<json>'
```

- `<name>` must match one of the tools declared in the request's `tools` list.
- `<name>` is used verbatim — no check against the request's `tools` list is performed.
- `<json>` is the arguments string passed back as the tool call arguments (use `'{}'` for no arguments).
- Multiple matching lines each produce a separate tool call response.
- If no line matches, or the named tool is not in `request.tools`, the strategy returns an empty list and the next strategy runs.
- If no line matches, the strategy returns an empty list and the next strategy runs.
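
When writing tests against llmock, it can help to generate the trigger phrase rather than hand-write it. The helper below is a hypothetical sketch (the `build_tool_trigger` name is not part of llmock). It rejects single quotes because the pattern in `strategy_tool_call.py` delimits both fields with them, so a quote inside the name or arguments would cut the match short.

```python
import json
from typing import Any, Optional


def build_tool_trigger(name: str, arguments: Optional[dict[str, Any]] = None) -> str:
    """Build a `call tool '<name>' with '<json>'` line for a user message.

    Hypothetical test helper, not provided by llmock.
    """
    args_json = json.dumps(arguments or {})
    if "'" in name or "'" in args_json:
        raise ValueError("single quotes cannot appear in the name or arguments")
    if "\n" in name:
        raise ValueError("the trigger phrase must stay on one line")
    return f"call tool '{name}' with '{args_json}'"


print(build_tool_trigger("calculator", {"expression": "6*7"}))
# call tool 'calculator' with '{"expression": "6*7"}'
```

Several triggers can be joined with newlines in one user message to request multiple tool calls.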

### Error Simulation

12 changes: 0 additions & 12 deletions docs/llmock-skill/references/config.yaml
@@ -21,15 +21,3 @@ models:
created: 1677610602
owned_by: "openai"

# Ordered list of strategies to try. The first strategy that returns a
# non-empty result wins. At least one must be set.
#
# Available strategies:
# MirrorStrategy - echoes back the last user message
# ToolCallStrategy - triggered by `call tool '<name>' with '<json>'` phrase in last user message
# ErrorStrategy - triggered by `raise error <json>` phrase in last user message
#
strategies:
- ErrorStrategy
- ToolCallStrategy
- MirrorStrategy
5 changes: 3 additions & 2 deletions src/llmock/strategies/strategy_composition.py
@@ -5,7 +5,8 @@
are tried in order; the first one that returns a **non-empty** list wins and
its result is returned immediately. Remaining strategies are not called.

If ``strategies`` is missing from config, defaults to ``["MirrorStrategy"]``.
If ``strategies`` is missing from config, defaults to
``["ErrorStrategy", "ToolCallStrategy", "MirrorStrategy"]``.

This strategy is **not** registered in the factory — it wraps the factory
internally and is the top-level strategy instantiated by the routers.
@@ -21,7 +22,7 @@

logger = logging.getLogger(__name__)

_DEFAULT_STRATEGIES = ["MirrorStrategy"]
_DEFAULT_STRATEGIES = ["ErrorStrategy", "ToolCallStrategy", "MirrorStrategy"]


class ChatCompositionStrategy:
65 changes: 14 additions & 51 deletions src/llmock/strategies/strategy_tool_call.py
@@ -5,13 +5,12 @@

call tool '<name>' with '<json>'

- ``<name>`` must match one of the tools declared in ``request.tools``.
- ``<name>`` is used verbatim — no check against ``request.tools`` is made.
- ``<json>`` must be a valid JSON string (may be empty, treated as ``{}``).

Each matching line produces one :class:`~llmock.strategies.base.StrategyResponse`
of type ``TOOL_CALL``. If no lines match the pattern (or the named tool is
not in the request), an empty list is returned and the next strategy in the
composition chain runs.
of type ``TOOL_CALL``. If no lines match the pattern, an empty list is
returned and the next strategy in the composition chain runs.

No configuration keys are required. Adding ``ToolCallStrategy`` to the
``strategies`` list in ``config.yaml`` is sufficient:
@@ -22,7 +21,6 @@
- MirrorStrategy
"""

import logging
import re
from typing import Any

@@ -34,8 +32,6 @@
extract_last_user_text_response,
)

logger = logging.getLogger(__name__)

# Matches: call tool '<name>' with '<args>'
_TRIGGER_RE = re.compile(r"call tool '([^']+)' with '([^']*)'")

@@ -45,49 +41,21 @@
# ---------------------------------------------------------------------------


def _tool_names_from_chat(request: ChatCompletionRequest) -> set[str]:
"""Return the set of tool function names declared in a Chat request."""
if not request.tools:
return set()
names: set[str] = set()
for tool in request.tools:
name = tool.get("function", {}).get("name")
if name:
names.add(name)
return names


def _tool_names_from_response(request: ResponseCreateRequest) -> set[str]:
"""Return the set of tool names declared in a Responses request."""
if not request.tools:
return set()
names: set[str] = set()
for tool in request.tools:
name = tool.get("name")
if name:
names.add(name)
return names


def _parse_triggers(text: str, available_tools: set[str]) -> list[StrategyResponse]:
def _parse_triggers(text: str) -> list[StrategyResponse]:
"""Scan *text* line-by-line and return tool responses for every match.

Each line is tested against ``_TRIGGER_RE``. A match is accepted when:
Each line is tested against ``_TRIGGER_RE``. A match produces a
``TOOL_CALL`` response using the extracted name and arguments verbatim;
no check against ``request.tools`` is performed.

1. The extracted tool ``<name>`` appears in ``available_tools``.
2. The extracted ``<json>`` (or ``{}`` when empty) is valid JSON.

Lines that do not match or fail either check are silently skipped.
Lines that do not match or contain invalid JSON args are silently skipped.
"""
responses: list[StrategyResponse] = []
for line in text.splitlines():
m = _TRIGGER_RE.search(line)
if not m:
continue
name, args_str = m.group(1), m.group(2)
if name not in available_tools:
logger.debug("Trigger tool '%s' not in request.tools — skipped", name)
continue
effective_args = args_str if args_str else "{}"
responses.append(tool_response(effective_args, name))
return responses
@@ -102,8 +70,8 @@ class ChatToolCallStrategy:
"""Trigger phrase–driven tool call strategy for Chat Completions API.

Parses the last user message for ``call tool '<name>' with '<json>'``
lines. Each matching line whose tool name appears in ``request.tools``
generates a ``TOOL_CALL`` strategy response.
lines. Each matching line generates a ``TOOL_CALL`` strategy response;
no validation against ``request.tools`` is performed.

No configuration is consumed.
"""
@@ -121,9 +89,6 @@ def generate_response(
cycle of an agentic loop where the original trigger still sits earlier
in the conversation history.
"""
available = _tool_names_from_chat(request)
if not available:
return []
# Only process the trigger when the last non-system turn is the user's.
last_role = next(
(msg.role for msg in reversed(request.messages) if msg.role != "system"),
@@ -134,14 +99,15 @@ def generate_response(
text = extract_last_user_text_chat(request)
if text is None:
return []
return _parse_triggers(text, available)
return _parse_triggers(text)


class ResponseToolCallStrategy:
"""Trigger phrase–driven tool call strategy for the Responses API.

Same behaviour as :class:`ChatToolCallStrategy` but operates on
:class:`~llmock.schemas.responses.ResponseCreateRequest` objects.
:class:`~llmock.schemas.responses.ResponseCreateRequest` objects. No
validation against ``request.tools`` is performed.

No configuration is consumed.
"""
Expand All @@ -161,9 +127,6 @@ def generate_response(
When ``request.input`` is a plain string it is always treated as a
fresh user turn.
"""
available = _tool_names_from_response(request)
if not available:
return []
# A plain string is always a fresh user turn — proceed normally.
if not isinstance(request.input, str):
# For a list input, the last item must be a user-role message.
@@ -177,4 +140,4 @@ def generate_response(
text = extract_last_user_text_response(request)
if text is None:
return []
return _parse_triggers(text, available)
return _parse_triggers(text)