Merged
55 changes: 55 additions & 0 deletions CHANGELOG.md
@@ -1,5 +1,60 @@
# Changelog

## 0.3.0

### aiohttp interception (soft dependency)

- `aiohttp.ClientSession._request` is now patched when `aiohttp` is importable.
The intercept SDK does **not** add `aiohttp` as a hard dependency — the patch
installs only when the user's environment already has it.
- This unlocks **LiteLLM** (which has used `aiohttp` as its default transport
since v1.71) and any framework that opts into an `aiohttp` extra (Google GenAI,
Google ADK, etc.).
- Body override (the simulation tamper hook) is not supported for
`aiohttp.ClientResponse` — recording fires in full but the response is
returned unchanged.

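A minimal sketch of the soft-dependency pattern described above, not the SDK's actual code: the patch is attempted only if the target library imports cleanly, and the wrapper records the call and returns the response unchanged. The names `install_soft_patch`, `record_async`, and `recorded` are hypothetical and exist only for this illustration.

```python
import functools
import importlib

recorded = []  # hypothetical in-memory stand-in for the intercept store


def record_async(original):
    """Wrap an async `_request(self, method, url, **kwargs)`-style method."""

    @functools.wraps(original)
    async def wrapper(self, method, url, **kwargs):
        recorded.append((method, str(url)))
        # Recording only: the response object is passed through untouched.
        return await original(self, method, url, **kwargs)

    return wrapper


def install_soft_patch(module_name, class_name, method_name, make_wrapper):
    """Patch `module.Class.method` only when the module is importable."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return False  # library absent: skip quietly, no hard dependency
    cls = getattr(module, class_name)
    setattr(cls, method_name, make_wrapper(getattr(cls, method_name)))
    return True
```

Under these assumptions, `install_soft_patch("aiohttp", "ClientSession", "_request", record_async)` would return `False` in an environment without `aiohttp` and install the wrapper otherwise.
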
### OpenAI Agents SDK integration

- Added `examples/openai_agents/` — a runnable end-to-end demo that drives a
real OpenRouter model call through the full Provably intercept → handoff →
evaluate pipeline. Model: `openai/gpt-4o-mini` (~$0.001/run); data API:
Open-Meteo (no auth required).
- Added `tests/e2e/test_openai_agents_e2e.py` — six deterministic scenarios
(A–F) using in-process `FakeHttpServer`s; zero network egress in CI.

### Broader HTTP interception surface

- `httpx.Client.send`, `httpx.AsyncClient.send`, and `requests.Session.send`
are now patched in addition to the existing module-level shortcuts
(`httpx.get`, `httpx.post`, `requests.get`, `requests.post`). This means
every outbound HTTP call from any framework — including the async agent loops
used by the OpenAI Agents SDK — is intercepted without any user-side changes.
- A re-entry contextvar guard (`_reentry.already_recording`) prevents
double-recording when a module-level call (e.g. `httpx.get`) internally
delegates to the newly-patched `Client.send`.

### Trust gate fires on all HTTP methods (BREAKING-ISH)

- **Before this release** `_require_trusted_endpoint` was only called for GET
requests. It now fires unconditionally for every method (POST, PUT, PATCH,
DELETE, etc.).
- **Migration:** register every outbound URL — including your LLM provider
(e.g. `https://openrouter.ai/api/v1/chat/completions`) — in `trusted_endpoints`
before running an agent. Use `INSERT ... ON CONFLICT DO NOTHING` or the
Provably dashboard to add rows. See `examples/openai_agents/agent_run.py`
for the pattern.

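The idempotent seeding step might look like the sketch below. It uses an in-memory SQLite database as a stand-in so it runs without PostgreSQL, and the `org_id` and `url` columns are assumptions, since the real `trusted_endpoints` schema is not shown here. With psycopg2 the placeholders would be `%s` instead of `?`.

```python
import sqlite3

# Hypothetical schema: column names are assumptions for illustration only.
SCHEMA = """
CREATE TABLE IF NOT EXISTS trusted_endpoints (
    org_id TEXT NOT NULL,
    url    TEXT NOT NULL,
    PRIMARY KEY (org_id, url)
)
"""


def seed_endpoints(conn, org_id, urls):
    """Register URLs idempotently; re-running never duplicates or fails."""
    conn.executemany(
        "INSERT INTO trusted_endpoints (org_id, url) VALUES (?, ?) "
        "ON CONFLICT (org_id, url) DO NOTHING",
        [(org_id, url) for url in urls],
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
urls = ["https://openrouter.ai/api/v1/chat/completions"]
seed_endpoints(conn, "org_demo", urls)
seed_endpoints(conn, "org_demo", urls)  # second run is a no-op
```
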
### New `provably_self_egress()` context manager

- Added `provably.intercept.provably_self_egress()` — a context manager that
marks a block of code as SDK-internal egress. Inside it, the trust gate is
bypassed and no intercept rows are written. All SDK self-egress sites
(`handoff.transport`, `handoff.evaluator`, `handoff._bootstrap`) already wrap
their own HTTP calls in this context, so the SDK never trips its own gate.
Advanced users who make their own Provably API calls from within an agent loop
can use this to avoid BLOCKED errors.

## 0.2.0

- Added `provably.configure_indexing(enable_indexing: bool)`: one-call bootstrap (`initialize_runtime` + `init_interceptor` + `enable` / `disable`) for sender agents.
2 changes: 1 addition & 1 deletion Dockerfile
@@ -56,7 +56,7 @@ COPY --from=builder /dist /dist

RUN pip install --upgrade pip \
&& pip install /dist/*.whl \
- && pip install "pytest>=8.0" "ruff>=0.3" "build>=1.2"
+ && pip install "pytest>=8.0" "pytest-asyncio>=0.23" "ruff>=0.3" "build>=1.2" "openai-agents" "aiohttp>=3.9"

COPY pyproject.toml ./
COPY tests ./tests
45 changes: 43 additions & 2 deletions README.md
@@ -18,6 +18,7 @@ to at all — is enforced before the request leaves the process.
## Contents

- [What it does](#what-it-does)
- [Framework coverage](#framework-coverage)
- [Install](#install)
- [Quick start](#quick-start)
- [Configuration](#configuration)
@@ -60,8 +61,8 @@ flowchart LR

The flow, in order:

- 1. **Intercept + Police** — every outbound `requests` / `httpx` call goes
- through the SDK's monkey-patched HTTP path. _Inside_ the interceptor, before
+ 1. **Intercept + Police** — every outbound `requests` / `httpx` / `aiohttp`
+ call goes through the SDK's monkey-patched HTTP path. _Inside_ the interceptor, before
the request leaves the process, the URL is checked against the
`trusted_endpoints` table. If the URL is not registered the call is killed
with `RuntimeError("BLOCKED: ...")` and never reaches the network.
@@ -91,6 +92,46 @@ Nothing in this loop relies on a model self-evaluating its own output.
| Eval service | **You** — any HTTP service that calls `provably.evaluate_handoff(...)` on the incoming payload. | The SDK gives you the function; you decide where to host it. |
| Provably query record | **Provably** — fetched over HTTPS by the eval service using the `integration_api_key` from the handoff payload. | This is the source of truth the evaluator compares each claim against. |

## Framework coverage

The interceptor patches the central HTTP transport choke points, so coverage of
agent frameworks follows automatically from which library a framework uses
under the hood. As of v0.3.0:

**Transport patches**

| Transport | Patched at |
| --- | --- |
| `requests` | module-level `get`/`post` + `Session.send` |
| `httpx` | module-level `get`/`post` + `Client.send` + `AsyncClient.send` |
| `aiohttp` | `ClientSession._request` (soft dep — patches only when `aiohttp` is importable) |
| `botocore` / `urllib3` | _pending_ — see [issue #10](https://github.com/ProvablyAI/provably-python-sdk/issues/10) |

**Agent / LLM frameworks**

| Framework | Status | Notes |
| --- | --- | --- |
| OpenAI SDK | ✅ | httpx |
| Anthropic SDK | ✅ | httpx |
| Pydantic AI | ✅ | delegates to AsyncOpenAI / AsyncAnthropic |
| LangChain | ✅ | delegates to provider SDKs |
| LangGraph | ✅ | delegates to provider SDKs (same as LangChain) |
| LlamaIndex | ✅ | httpx via OpenAI SDK |
| AutoGen | ✅ | AsyncOpenAI |
| Haystack | ✅ | migrated to httpx (2024–25) |
| Phidata / Agno | ✅ | AsyncOpenAI / `httpx[http2]` |
| OpenAI Agents SDK | ✅ | httpx; e2e suite at [tests/e2e/test_openai_agents_e2e.py](tests/e2e/test_openai_agents_e2e.py); demo at [examples/openai_agents/](examples/openai_agents/) |
| Google GenAI | ✅ | httpx default + optional `aiohttp` extra |
| LiteLLM | ✅ | aiohttp transport (default since v1.71) |
| DSPy | ✅ | LiteLLM only |
| smolagents | ✅ | OpenAI SDK / HF / LiteLLM paths covered |
| CrewAI | ⚠️ | OpenAI/Anthropic ✅, LiteLLM fallback ✅, **Bedrock provider ❌** (boto3) |
| AWS Strands | ❌ | boto3/botocore → urllib3; tracked in [issue #10](https://github.com/ProvablyAI/provably-python-sdk/issues/10) |

**Out of scope for the HTTP interception layer** (separate shipping units):
MCP servers, in-process LLMs (`transformers`, `mlx_lm`), gRPC (Google ADK
A2A), websockets, raw sockets.

## Install

> **Status:** v0.2 — not yet published to PyPI. Install from source.
100 changes: 100 additions & 0 deletions examples/openai_agents/README.md
@@ -0,0 +1,100 @@
# OpenAI Agents SDK + Provably — Runnable Demo

This demo shows an end-to-end run of the Provably SDK integrated with the
[OpenAI Agents SDK](https://github.com/openai/openai-agents-python) (>=0.0.3).
It exercises every pillar of the SDK in a single script:

1. **Intercept** — `configure_indexing(True)` installs monkey-patches on
`httpx.AsyncClient.send` and `requests.Session.send` so every outbound HTTP
request from the agent loop is captured and stored in `provably_intercepts`.
2. **Trust gate** — before storing a request the SDK checks that its URL is
registered in `trusted_endpoints`. The demo seeds both the OpenRouter
chat-completions URL and the Open-Meteo weather URL before running the agent.
3. **Tool call** — the agent uses a `@function_tool` that calls the free
[Open-Meteo API](https://open-meteo.com/) (no API key required) to fetch the
current temperature in London.
4. **Handoff** — the captured intercept row id is wrapped in a `HandoffPayload`
with one `HandoffClaim` asserting the tool output.
5. **Evaluate** — `evaluate_handoff()` fetches the stored query record from the
Provably backend, compares it to the claimed value, and prints the verdict.

Expected output (abbreviated):

```json
{
"outcome": "PASS",
"per_claim": [
{
"action_name": "get_weather",
"result": "PASS",
"proof_time_ms": 42,
"verify_time_ms": 137
}
],
"errors": []
}
```

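One way to consume a verdict of this shape is sketched below. It works on the printed JSON only and assumes nothing about the return type of `evaluate_handoff()`; the helper name `failed_claims` is hypothetical, and the field names are taken from the sample output above.

```python
import json


def failed_claims(verdict_json):
    """Return the action names of every claim whose result is not PASS."""
    verdict = json.loads(verdict_json)
    return [
        claim["action_name"]
        for claim in verdict["per_claim"]
        if claim["result"] != "PASS"
    ]


sample = """
{
  "outcome": "PASS",
  "per_claim": [
    {"action_name": "get_weather", "result": "PASS",
     "proof_time_ms": 42, "verify_time_ms": 137}
  ],
  "errors": []
}
"""
print(failed_claims(sample))  # empty list: every claim passed
```
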
## Required environment variables

| Variable | Required | Notes |
|---|---|---|
| `OPENROUTER_API_KEY` | yes | API key for [OpenRouter](https://openrouter.ai/). Used for the model call (`openai/gpt-4o-mini`). |
| `PROVABLY_API_KEY` | yes | Provably integration API key. |
| `PROVABLY_ORG_ID` | yes | Provably organisation id. Scopes trusted-endpoint and query-record lookups. |
| `PROVABLY_RUST_BE_URL` | yes | Base URL of the Provably Rust backend (e.g. `https://api.provably.ai`). |
| `POSTGRES_URL` | yes | PostgreSQL DSN (e.g. `postgresql://user:pass@host/db`). Used for intercept storage and trusted-endpoint registry. |

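A small preflight check can catch a missing variable before the agent starts. This is a sketch, not part of the demo script: `missing_vars` is a hypothetical helper, and the variable names are copied from the table above.

```python
import os

REQUIRED_VARS = (
    "OPENROUTER_API_KEY",
    "PROVABLY_API_KEY",
    "PROVABLY_ORG_ID",
    "PROVABLY_RUST_BE_URL",
    "POSTGRES_URL",
)


def missing_vars(environ=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_vars()
    if missing:
        raise SystemExit("Missing environment variables: " + ", ".join(missing))
```
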
## How to run

```bash
# 1. Install the SDK in editable mode with dev extras (includes openai-agents)
pip install -e ".[dev]"

# 2. Export the required env vars
export OPENROUTER_API_KEY="sk-or-..."
export PROVABLY_API_KEY="prov_..."
export PROVABLY_ORG_ID="org_..."
export PROVABLY_RUST_BE_URL="https://api.provably.ai"
export POSTGRES_URL="postgresql://user:pass@localhost/provably"

# 3. Run the demo
python examples/openai_agents/agent_run.py
```

## Model and cost

The demo uses **`openai/gpt-4o-mini`** on OpenRouter — a cheap, capable model
that reliably follows tool-calling instructions. Estimated cost is approximately
**$0.001 per run** (one tool call + one summary turn).

## How the trust gate works — and what happens when you forget to seed it

The Provably SDK now enforces trust on **all HTTP methods** (GET, POST, etc.),
not only GET. This means the LLM provider call (a POST to OpenRouter) *and* the
weather API call (a GET to Open-Meteo) both need to be registered in
`trusted_endpoints` before the agent runs.

If you forget to seed an endpoint, the SDK raises:

```
RuntimeError: BLOCKED: endpoint https://openrouter.ai/api/v1/chat/completions not in trusted index for org <org_id>
```

When this error occurs inside `httpx.AsyncClient.send` (the async LLM call), the
OpenAI SDK wraps it in an `APIConnectionError`. You can inspect the full
exception chain to find the original `BLOCKED: ...` message.

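Walking the exception chain to recover the original message could look like the following sketch. `FakeConnectionError` is a stand-in for the client library's wrapper so the example runs without the OpenAI SDK installed, and `find_blocked_message` is a hypothetical helper.

```python
class FakeConnectionError(Exception):
    """Stand-in for a client library's connection-error wrapper."""


def find_blocked_message(exc):
    """Follow __cause__/__context__ until a BLOCKED RuntimeError is found."""
    seen = set()
    while exc is not None and id(exc) not in seen:
        seen.add(id(exc))  # guard against cyclic chains
        message = str(exc)
        if isinstance(exc, RuntimeError) and message.startswith("BLOCKED:"):
            return message
        exc = exc.__cause__ or exc.__context__
    return None


def call_model():
    # Simulates the gate firing inside the transport and being wrapped.
    try:
        raise RuntimeError(
            "BLOCKED: endpoint https://openrouter.ai/api/v1/chat/completions "
            "not in trusted index for org org_demo"
        )
    except RuntimeError as err:
        raise FakeConnectionError("Connection error.") from err


try:
    call_model()
except FakeConnectionError as exc:
    blocked = find_blocked_message(exc)
    print(blocked)
```
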
**Migration note for existing users:** if you relied on the SDK trust-checking
only GET requests, you must now register *all* outbound URLs, including your
LLM provider URL. Use the `seed_trusted_endpoints` helper pattern shown in this
demo (raw psycopg2 `INSERT ... ON CONFLICT DO NOTHING`), or add rows via the
Provably dashboard.

## How `provably_self_egress()` relates to this demo

The Provably SDK's own HTTP calls (fetching query records, posting verify
requests, bootstrap handshakes) are **never** blocked by the trust gate. They
run inside `with provably_self_egress():` context managers that mark them as
SDK-internal egress, so the trust gate is bypassed automatically. You do not
need to add Provably's own backend URL to `trusted_endpoints`.