Spike Claude Agent SDK chat runner #42

Open
SakshiKekre wants to merge 11 commits into PolicyEngine:main from SakshiKekre:codex/claude-agent-sdk-spike

@SakshiKekre

Summary

  • Adds an opt-in Claude Agent SDK runner behind POLICYENGINE_CHAT_AGENT_RUNNER=claude_sdk
  • Reuses the model backend registry from PR #41 (Abstract chat simulation backends) and exposes run_python through an in-process MCP tool
  • Carries plan mode through the request/UI so the SDK workflow can be compared against the direct Anthropic loop
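
A minimal sketch of how the opt-in switch described above could be read, assuming the runner is chosen from the environment at request time. Only the variable name POLICYENGINE_CHAT_AGENT_RUNNER and the value claude_sdk come from this PR; the function name `select_runner`, the label `anthropic_direct`, and the whitespace/case normalization are hypothetical and are not taken from the diff.

```python
import os

# From the PR description: the env var and the opt-in value.
RUNNER_ENV_VAR = "POLICYENGINE_CHAT_AGENT_RUNNER"
SDK_RUNNER = "claude_sdk"

# Hypothetical label for the existing direct Anthropic Messages loop.
DEFAULT_RUNNER = "anthropic_direct"


def select_runner(environ=None):
    """Return the runner label to use for a chat request.

    Falls back to the default loop unless the env var is set to
    exactly the SDK value (after trimming and lowercasing, which is
    an assumption of this sketch).
    """
    env = os.environ if environ is None else environ
    value = env.get(RUNNER_ENV_VAR, "").strip().lower()
    if value == SDK_RUNNER:
        return SDK_RUNNER
    # Unset or unrecognized values keep the default behavior.
    return DEFAULT_RUNNER
```

Treating unset or unknown values as the default keeps the SDK path strictly opt-in, which matches the intent that the direct loop stays the default during the comparison.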

Review note

This PR is opened against main for visibility in the PolicyEngine repo dashboard. It is stacked on top of PR #41, so the diff includes the model-backend abstraction work plus the SDK spike. For a focused SDK-only diff, compare codex/model-backend-abstraction...codex/claude-agent-sdk-spike or use the fork PR: SakshiKekre#1

Notes

This is a spike for comparison, not yet a recommendation to replace the existing runner. The existing direct Anthropic Messages loop remains the default.

Validation

  • cd backend && ../.venv/bin/python -m py_compile routes/chatbot.py claude_agent_sdk_runner.py model_backends.py agent_tools.py
  • cd backend && ../.venv/bin/python -m pytest tests/test_agent_tools.py
  • cd backend && ../.venv/bin/python -m pytest tests/test_api.py::TestChatBackends::test_lists_backends tests/test_api.py::TestChatMessage::test_unknown_model_backend_returns_400
  • cd frontend && npm run build

Manual test

Set POLICYENGINE_CHAT_AGENT_RUNNER=claude_sdk, provide ANTHROPIC_API_KEY, and run the backend to compare SDK streaming and tool behavior against the direct Anthropic loop.

@vercel

vercel Bot commented May 6, 2026

@SakshiKekre is attempting to deploy a commit to the PolicyEngine Team on Vercel.

A member of the Team first needs to authorize it.
