Your MCP servers have blind spots. Your agents already know what they are.
PatchworkMCP adds a feedback tool to any MCP server. When an agent hits a gap — missing tool, wrong format, incomplete data — it tells you exactly what it needed and how to fix it. Then it drafts the PR.
No guessing. No feature request backlogs. Just structured signal from the agents that actually use your tools.
We added PatchworkMCP to an AI Cost Manager MCP server. Within one session, Claude reported:
Gap: `missing_tool`
What it needed: A tool to search cost events by context field values (e.g., `run_id`, `session_id`)
What it tried: `get_costs`, `get_usage_events` — neither supported context-based filtering
Suggestion: "A `search_costs_by_context` tool that accepts context key-value pairs with AND logic, combined with standard date/service/customer filters. Returns paginated event records with full context."
That's not a vague complaint. That's a tool spec. PatchworkMCP can take that feedback, read your repo, and open a draft PR with the implementation.
The MCP ecosystem has a "more builders than users" problem — Gergely Orosz and team lay it out well in their deep dive. Developers ship MCP servers and don't know what agents actually need from them. The best practice is to design for agents, not humans, but agents can't tell you what's missing unless you give them a way to.
That's what PatchworkMCP does. It's built for active MCP server development — both public servers and the internal ones that make up the majority of real MCP adoption. Whether you're building tools for Claude Desktop users or wrapping an internal data warehouse for your team, the feedback loop is the same: wire it up, let agents use your server, see exactly where they hit walls, and draft fixes in under a minute.
This is the MVP. The bigger picture is below in Where This Is Going.
Agent hits a wall Feedback captured You review + ship
┌──────────────┐ POST ┌──────────────┐ ┌──────────────┐
│ MCP Server │ ────────▶ │ Sidecar │ ─────▶ │ Dashboard │
│ + feedback │ │ SQLite │ │ Draft PR │
│ tool │ │ FastAPI │ │ one click │
└──────────────┘ └──────────────┘ └──────────────┘
- Copy a single file into your MCP server (Python, TypeScript, Go, or Rust)
- Agents call the feedback tool when they can't do what the user asked
- Browse `localhost:8099` to review feedback, add notes, spot patterns
- Click Draft PR — PatchworkMCP reads your repo, sends the feedback + code context to an LLM, and opens a draft pull request with the suggested fix
30 seconds to running:
```bash
git clone https://github.com/keyton-weissinger/patchworkmcp.git
cd patchworkmcp
uv run server.py
```

Open http://localhost:8099. That's the sidecar — it stores feedback and serves the dashboard.
Now wire up your MCP server. Pick your language, copy one file:
Python (FastMCP) — 2 lines
Copy drop-ins/python/feedback_tool.py into your project.
```bash
uv add httpx  # or: pip install httpx
```

```python
from mcp.server.fastmcp import FastMCP
from feedback_tool import register_feedback_tool

server = FastMCP("my-server")
register_feedback_tool(server, "my-server")
```

Python (Django MCP)
Copy drop-ins/python/feedback_tool.py into your tools directory.
```bash
uv add httpx
```

```python
from mcp.tools.registry import mcp_tool
from feedback_tool import TOOL_NAME, TOOL_DESCRIPTION, TOOL_INPUT_SCHEMA, send_feedback_sync

@mcp_tool(name=TOOL_NAME, description=TOOL_DESCRIPTION, input_schema=TOOL_INPUT_SCHEMA)
def feedback(credential, arguments):
    return send_feedback_sync(arguments, server_name="my-server")
```

Python (Raw MCP SDK)
Copy drop-ins/python/feedback_tool.py into your project.
```python
from feedback_tool import get_tool_definition, send_feedback

# In your list_tools handler:
tools.append(get_tool_definition())

# In your call_tool handler:
if name == "feedback":
    result = await send_feedback(arguments, server_name="my-server")
```

TypeScript
Copy drop-ins/typescript/feedback-tool.ts into your project. No extra dependencies — uses built-in fetch (Node 18+).
```typescript
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { registerFeedbackTool } from "./feedback-tool.js";

const server = new McpServer({ name: "my-server", version: "1.0.0" });
registerFeedbackTool(server, "my-server");
```

Go
Copy drop-ins/go/feedback_tool.go into your project. Only depends on github.com/mark3labs/mcp-go and stdlib.
```go
import "your-project/feedback"

s := server.NewMCPServer("my-server", "1.0.0")
feedback.RegisterFeedbackTool(s, "my-server")
```

Rust
Copy drop-ins/rust/feedback_tool.rs into your project.
```toml
[dependencies]
reqwest = { version = "0.12", features = ["json"] }
serde = { version = "1", features = ["derive"] }
serde_json = "1"
```

```rust
use feedback_tool::{payload_from_args, send_feedback, TOOL_NAME, TOOL_DESCRIPTION};

let payload = payload_from_args(&args, "my-server");
let message = send_feedback(&payload).await;
```
let message = send_feedback(&payload).await;Test it: Use your MCP server via Claude Desktop, Cursor, Claude Code, etc. Ask the agent to do something the server can't handle. Check http://localhost:8099 — you'll see what it reported.
This is the payoff. Click Draft PR on any feedback card and PatchworkMCP will:
- Read your repo's file tree via GitHub API
- Score files by MCP relevance and select the most important ones
- Send the feedback + your notes + code context to the LLM with structured output enforcement
- Create a branch, commit the change, and open a draft PR
You get a real-time progress modal showing each step. The PR links back to the original feedback.
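The file-selection step can be sketched roughly. The keyword weights and function names below are invented for illustration — the actual heuristic lives in server.py and may differ:

```python
from pathlib import PurePosixPath

# Hypothetical keyword weights: files whose names mention MCP concepts
# are more likely to define or register tools.
KEYWORDS = {"tool": 3, "mcp": 3, "server": 2, "schema": 2, "handler": 1}

def score_path(path: str) -> int:
    """Score a repo file path by how likely it is to contain MCP tool code."""
    name = PurePosixPath(path).name.lower()
    score = sum(weight for kw, weight in KEYWORDS.items() if kw in name)
    if path.endswith((".py", ".ts", ".go", ".rs")):
        score += 1  # prefer source files over docs/assets
    return score

def select_files(paths: list[str], top_n: int = 5) -> list[str]:
    """Keep only the most relevant files to send as LLM context."""
    return sorted(paths, key=score_path, reverse=True)[:top_n]
```

Capping context to the top-scored files keeps the prompt small enough that the feedback and your notes stay prominent.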
Configure your GitHub PAT, repo, and LLM provider in the dashboard settings panel:
| Provider | Default Model |
|---|---|
| Anthropic | claude-opus-4-6 |
| OpenAI | GPT-5.2-Codex |
Both use structured output with constrained decoding — the LLM returns valid JSON every time.
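A constrained-decoding schema for a single-file draft PR might look like the sketch below. The field names (`title`, `file_path`, `new_content`) are illustrative assumptions, not the schema server.py actually sends:

```python
# Hypothetical JSON Schema for a single-file draft PR. With constrained
# decoding, the provider guarantees the completion parses against it.
PR_SCHEMA = {
    "type": "object",
    "required": ["title", "body", "file_path", "new_content"],
    "properties": {
        "title": {"type": "string"},        # PR title
        "body": {"type": "string"},         # PR description
        "file_path": {"type": "string"},    # file the change touches
        "new_content": {"type": "string"},  # full replacement contents
    },
    "additionalProperties": False,
}

def matches_required(obj: dict) -> bool:
    """Cheap local check that a parsed response has the required string fields."""
    return all(isinstance(obj.get(k), str) for k in PR_SCHEMA["required"])
```

Enforcing the schema at decode time (rather than parsing free-form output) is what makes the branch-commit-PR steps safe to automate.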
Before clicking Draft PR, add notes to the feedback card. Notes are human-written annotations — "the real issue is missing pagination", "look at how get_costs handles this", "this should be a new tool, not a param on the existing one." These get sent to the LLM as prioritized developer context, so the generated PR reflects what you actually want, not just what the agent reported.
Notes are append-only with timestamps, so you build up context over multiple review sessions. The progress modal shows when notes are being included.
First PR not quite right? Click Re-draft to generate a new one. Refine your notes between attempts — the LLM sees the updated context each time. The old PR stays on GitHub; the dashboard link updates to point to the new one.
Every feedback item includes:
| Field | Required | Purpose |
|---|---|---|
| `what_i_needed` | Yes | The missing capability — the core signal |
| `what_i_tried` | Yes | What the agent attempted first. Separates "didn't find it" from "doesn't exist" |
| `gap_type` | Yes | `missing_tool` · `incomplete_results` · `missing_parameter` · `wrong_format` · `other` |
| `suggestion` | No | The agent's proposed fix. Often includes a full tool signature. |
| `user_goal` | No | What the human was trying to do. Prioritize by real user impact. |
| `resolution` | No | `blocked` · `worked_around` · `partial` |
| `tools_available` | No | What tools the agent could see. Context for the gap. |
| `agent_model` | No | Which model reported it. Separate model confusion from real gaps. |
| `session_id` | No | Groups feedback from one conversation. Reveals multi-step failures. |
| `client_type` | No | Which MCP client reported it (`claude-desktop`, `cursor`, `claude-code`). |
Notes are append-only with timestamps — you never lose an annotation.
| Variable | Default | Description |
|---|---|---|
| `FEEDBACK_SIDECAR_URL` | `http://localhost:8099` | Where drop-ins send feedback |
| `FEEDBACK_API_KEY` | (none) | Optional shared secret for auth |
| `FEEDBACK_DB_PATH` | `./feedback.db` | SQLite path for the sidecar |
| `FEEDBACK_PORT` | `8099` | Port for `uv run server.py` |
Draft PR settings (GitHub PAT, API keys) are stored in a .env file that's gitignored — not in the database.
The sidecar is designed for local development — localhost:8099 with no auth by default. For shared or remote deployments:
- Set `FEEDBACK_API_KEY` to a shared secret. Drop-ins and the sidecar both read it — requests without a valid `Authorization: Bearer <key>` header are rejected.
- Put the sidecar behind HTTPS (nginx, Caddy, etc.) if it's not on localhost.
- GitHub PATs and LLM API keys are stored in `.env`, never in SQLite or API responses. The settings endpoint masks keys to their last 4 characters.
| Method | Endpoint | Description |
|---|---|---|
| POST | `/api/feedback` | Submit feedback (called by drop-ins) |
| GET | `/api/feedback` | List feedback with filters |
| GET | `/api/feedback/{id}` | Single item with notes |
| PATCH | `/api/feedback/{id}` | Toggle reviewed status |
| POST | `/api/feedback/{id}/notes` | Add a note |
| POST | `/api/feedback/{id}/draft-pr` | Generate a draft PR (SSE stream) |
| GET | `/api/stats` | Counts by server, gap type, resolution |
| GET | `/api/settings` | Current settings (keys masked) |
| PUT | `/api/settings` | Update settings |
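Listing feedback is a plain GET with optional query filters. The filter parameter name below (`gap_type`) is an assumption for illustration — check what `GET /api/feedback` actually accepts:

```python
import urllib.parse
import urllib.request

def feedback_url(base: str = "http://localhost:8099", **filters: str) -> str:
    """Build a GET /api/feedback URL with optional query filters."""
    qs = urllib.parse.urlencode(filters)
    return f"{base}/api/feedback" + (f"?{qs}" if qs else "")

def list_feedback(**filters: str) -> bytes:
    """Fetch matching feedback JSON from a running sidecar."""
    with urllib.request.urlopen(feedback_url(**filters)) as resp:
        return resp.read()
```

Usage: `list_feedback(gap_type="missing_tool")` against a running sidecar returns the filtered records as JSON.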
The entire sidecar is one Python file (server.py). No framework, no build step, no Docker required. FastAPI + SQLite + inline HTML/CSS/JS.
```
patchworkmcp/
├── server.py        # Everything: API, database, UI, GitHub client, LLM integration
├── feedback.db      # Created on first run
├── .env             # API keys (gitignored)
├── drop-ins/
│   ├── python/      # FastMCP, Django MCP, raw SDK
│   ├── typescript/  # @modelcontextprotocol/sdk
│   ├── go/          # mcp-go
│   └── rust/        # reqwest-based
└── docs/            # Screenshots and demo media
```
The sidecar API is the stable contract. Any language can participate by:
- Defining the tool schema (name, description, input properties)
- POSTing JSON to `{SIDECAR_URL}/api/feedback`
- Providing a framework-specific registration helper
If you build one, open a PR. The pattern: one file, zero extra deps beyond the MCP SDK.
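In Python terms, the whole contract is one HTTP call. A stripped-down sketch of what each drop-in does — the `server_name` field name is assumed from the existing drop-ins' helper signatures, not verified against the sidecar's API:

```python
import json
import os
import urllib.request

SIDECAR_URL = os.environ.get("FEEDBACK_SIDECAR_URL", "http://localhost:8099")

def build_request(arguments: dict, server_name: str) -> tuple[str, bytes]:
    """Assemble the URL and JSON body for one feedback POST."""
    body = json.dumps({**arguments, "server_name": server_name}).encode()
    return f"{SIDECAR_URL}/api/feedback", body

def post_feedback(arguments: dict, server_name: str) -> int:
    """Send the record to the sidecar; returns the HTTP status code."""
    url, body = build_request(arguments, server_name)
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}, method="POST"
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status
```

A new-language drop-in is this plus the tool schema and a registration helper for its MCP SDK.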
What you see today is a developer tool: feedback comes in, you review it, you click a button, you get a PR. That's useful, but it's step one.
The end state is a self-monitoring system for MCP servers. Feedback accumulates across sessions, users, and agents. PatchworkMCP learns to deduplicate reports, cluster related gaps, and grade them by frequency and severity. Instead of acting on the first report of a missing tool, it waits — sees that 15 different sessions hit the same wall, that 8 of them were fully blocked, and that the agents all suggested roughly the same fix. Then it acts.
The human stays in the loop, but how much control you want is a spectrum:
| Level | What happens | Who decides |
|---|---|---|
| Suggestions only | PatchworkMCP surfaces prioritized gaps with analysis | You build it yourself |
| Draft PRs | Generates a PR for review (where we are today) | You review and merge |
| Auto-PRs | Opens PRs automatically when confidence is high | You merge |
| Auto-merge | Ships vetted changes to your server | Guardrails + your approval rules |
Every level up requires more guardrails — confidence thresholds, test coverage requirements, scope limits, rollback hooks. We're building those incrementally, not shipping "auto-merge" as a flag you can flip tomorrow.
- Feedback capture and review dashboard
- Drop-ins for Python, TypeScript, Go, Rust
- Append-only notes with timestamps
- LLM-powered draft PRs from feedback (Anthropic + OpenAI)
- Structured output enforcement (constrained decoding)
- Real-time progress streaming during PR creation
- Developer notes as LLM context for better PRs
- Re-draft workflow for iterating on PRs
- Dark / light theme
- Feedback deduplication and clustering (group similar reports across sessions)
- Severity scoring based on frequency, resolution type, and user impact
- Multi-file PRs
- Webhook notifications for new feedback
- Confidence-gated auto-PRs with configurable thresholds
- Export to CSV/JSON
MIT


