Merged
3 changes: 2 additions & 1 deletion README.md
@@ -55,7 +55,8 @@ The Tool workflow (tools-only, not MCP tasks protocol)
3. non-tool step: draft/approve prompt
4. `task_create`
5. `task_status` (poll every 5 minutes until done)
6. download the result via `task_download` or via `task_file_info`
6. optional if failed: `task_retry`
7. download the result via `task_download` or via `task_file_info`
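
The workflow above could be driven from a client roughly as follows. This is a minimal sketch, not PlanExe's own client: `call_tool` is a hypothetical callable standing in for whatever MCP client is in use, and the `prompt` argument name for `task_create` is an assumption.

```python
import time
from typing import Any, Callable, Dict

# Hypothetical signature: call_tool(tool_name, arguments) -> response dict.
ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def run_plan(call_tool: ToolCaller, prompt: str, max_retries: int = 1,
             poll_seconds: float = 300.0,
             sleep: Callable[[float], None] = time.sleep) -> Dict[str, Any]:
    """Create a task, poll until done, retry on failure, then fetch file info."""
    # Assumption: task_create takes the approved prompt under the key "prompt".
    task_id = call_tool("task_create", {"prompt": prompt})["task_id"]
    retries_left = max_retries
    while True:
        state = call_tool("task_status", {"task_id": task_id})["state"]
        if state == "completed":
            return call_tool("task_file_info", {"task_id": task_id})
        if state == "failed":
            if retries_left == 0:
                raise RuntimeError(f"task {task_id} failed")
            retries_left -= 1
            # task_retry requeues the same task_id; model_profile is optional.
            call_tool("task_retry", {"task_id": task_id})
        sleep(poll_seconds)  # about every 5 minutes by default
```

Passing `sleep` as a parameter keeps the loop testable without waiting five minutes per poll.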

Concurrency note: each `task_create` call returns a new `task_id`; server-side global per-client concurrency is not capped, so clients should track their own parallel tasks.

8 changes: 3 additions & 5 deletions docs/ai_providers/openrouter.md
@@ -8,17 +8,15 @@ For new users, OpenRouter is the recommended starting point. When you have have

[OpenRouter](https://openrouter.ai/) provides access to a large number of LLMs that run in the cloud.

Unfortunately, there is no `free` model that works reliably with PlanExe.
Unfortunately, there is no `free` model that works reliably with PlanExe. When I use a `free` model on OpenRouter, PlanExe fails to create a plan most of the time. My impression is that the `free` models are unreliable and slow; I guess the AI providers don't treat `free` models as high priority.

In my experience, the `paid` models are the most reliable. Models like [google/gemini-2.0-flash-001](https://openrouter.ai/google/gemini-2.0-flash-001) and [openai/gpt-4o-mini](https://openrouter.ai/openai/gpt-4o-mini) are cheap, faster than running models on my own computer, and carry no risk of my computer overheating.

I haven't been able to find a `free` model on OpenRouter that works well with PlanExe.

Avoid pricey `paid` models. PlanExe does more than 100 LLM inference calls per plan, so each run uses many tokens. With a cheap model, creating a full plan costs less than 0.30 USD; with one of the newest models, the price can exceed 20 USD. To keep PlanExe affordable for as many users as possible, the defaults use older, cheaper models.
Avoid pricey `paid` models. PlanExe does more than 100 LLM inference calls per plan, so each run uses many tokens. With a cheap model, creating a full plan costs less than 0.50 USD; with one of the newest models, the price can exceed 20 USD. To keep PlanExe affordable for as many users as possible, the defaults use older, cheaper models.
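
The cost gap follows from simple arithmetic: the number of calls times the per-call token cost. Below is a rough sketch of that estimate. The token counts and per-million-token prices are illustrative assumptions, not measured PlanExe figures or quoted OpenRouter prices.

```python
def estimate_plan_cost_usd(calls: int,
                           input_tokens_per_call: int,
                           output_tokens_per_call: int,
                           usd_per_million_input: float,
                           usd_per_million_output: float) -> float:
    """Estimate the cost of one plan as calls x per-call token cost."""
    per_call = (input_tokens_per_call * usd_per_million_input
                + output_tokens_per_call * usd_per_million_output) / 1_000_000
    return calls * per_call


# Illustrative numbers only (assumptions): 100 calls, 4000 input and
# 1000 output tokens per call, 0.15 / 0.60 USD per million tokens.
cheap = estimate_plan_cost_usd(100, 4000, 1000, 0.15, 0.60)
print(f"cheap model: about {cheap:.2f} USD per plan")
```

With the same call and token counts, the estimate scales linearly with the model's token prices, which is why a model priced 100 times higher moves a plan from cents to tens of dollars.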

## Quickstart (Docker)

1. Install Docker (with Docker Compose) — no local Python or pip is needed now.
1. Install Docker (with Docker Compose) — no local Python or pip is needed.
2. Clone the repo and enter it:
```
git clone https://github.com/PlanExeOrg/PlanExe.git
1 change: 1 addition & 0 deletions docs/mcp/inspector.md
@@ -72,6 +72,7 @@ model_profiles
task_create
task_status
task_stop
task_retry
task_file_info
```

27 changes: 27 additions & 0 deletions docs/mcp/mcp_details.md
@@ -168,6 +168,25 @@ Example call:
{"task_id": "2d57a448-1b09-45aa-ad37-e69891ff6ec7"}
```

### task_retry

Retry a failed task by requeueing the same `task_id`.

Example prompt:
```
Retry failed task 2d57a448-1b09-45aa-ad37-e69891ff6ec7 with baseline profile.
```

Example call:
```json
{"task_id": "2d57a448-1b09-45aa-ad37-e69891ff6ec7", "model_profile": "baseline"}
```

Notes:
- `model_profile` is optional and defaults to `baseline`.
- Only failed tasks can be retried.
- Non-failed tasks return `TASK_NOT_FAILED`.
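
A client might assemble the `task_retry` arguments along these lines. This is a sketch under assumptions: `build_task_retry_args` is a hypothetical helper, and the profile names are taken from the `task_retry` interface description (`baseline`, `premium`, `frontier`, `custom`).

```python
import uuid
from typing import Dict, Optional

# Profile names as listed for task_retry in the interface docs.
KNOWN_PROFILES = {"baseline", "premium", "frontier", "custom"}


def build_task_retry_args(task_id: str,
                          model_profile: Optional[str] = None) -> Dict[str, str]:
    """Build the JSON arguments for a task_retry call (hypothetical helper)."""
    uuid.UUID(task_id)  # raises ValueError if task_id is not a valid UUID
    args = {"task_id": task_id}
    if model_profile is not None:
        if model_profile not in KNOWN_PROFILES:
            raise ValueError(f"unknown model_profile: {model_profile}")
        args["model_profile"] = model_profile
    # When model_profile is omitted, the server defaults to "baseline".
    return args
```

Validating the UUID locally avoids a round trip that would only return `TASK_NOT_FOUND`.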

### task_file_info

Return download metadata for report or zip artifacts.
@@ -245,6 +264,7 @@ Error payload shape:

Common cloud/core error codes:
- `TASK_NOT_FOUND`
- `TASK_NOT_FAILED`
- `INVALID_USER_API_KEY`
- `USER_API_KEY_REQUIRED`
- `INSUFFICIENT_CREDITS`
@@ -322,6 +342,13 @@ Tool call:
{"task_id": "<task_id_from_task_create>"}
```

If state is `failed`, optional retry:

Tool call:
```json
{"task_id": "<task_id_from_task_create>", "model_profile": "baseline"}
```

### 6. Download the report

Prompt:
25 changes: 25 additions & 0 deletions docs/mcp/mcp_registry.md
@@ -57,3 +57,28 @@ curl "https://registry.modelcontextprotocol.io/v0.1/servers?search=io.github.Pla
```

If found in registry search, it should become discoverable in the GitHub MCP Registry UI at [https://github.com/mcp](https://github.com/mcp).

## 5) Claim on Glama

Glama connector claim verification expects a public well-known file:

- `https://mcp.planexe.org/.well-known/glama.json`

PlanExe serves this from `mcp_cloud/http_server.py` with the schema:

```json
{
"$schema": "https://glama.ai/mcp/schemas/connector.json",
"maintainers": [
{
"email": "neoneye@gmail.com"
}
]
}
```

If needed, override the email via environment variable:

```bash
PLANEXE_MCP_GLAMA_MAINTAINER_EMAIL=neoneye@gmail.com
```
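
The payload with its environment override could be built roughly like this. This is a sketch only: `build_glama_payload` is a hypothetical name, and the actual code in `mcp_cloud/http_server.py` may be organized differently.

```python
import json
import os
from typing import Any, Dict, Mapping

DEFAULT_MAINTAINER_EMAIL = "neoneye@gmail.com"
ENV_VAR = "PLANEXE_MCP_GLAMA_MAINTAINER_EMAIL"


def build_glama_payload(env: Mapping[str, str] = os.environ) -> Dict[str, Any]:
    """Return the glama.json body, using the env override when it is set."""
    # An unset or blank variable falls back to the default email.
    email = env.get(ENV_VAR, "").strip() or DEFAULT_MAINTAINER_EMAIL
    return {
        "$schema": "https://glama.ai/mcp/schemas/connector.json",
        "maintainers": [{"email": email}],
    }


if __name__ == "__main__":
    print(json.dumps(build_glama_payload(), indent=2))
```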
6 changes: 4 additions & 2 deletions docs/mcp/mcp_setup.md
@@ -16,7 +16,8 @@ This is the shortest path to a working PlanExe MCP integration.
Use this compact shape: objective, scope, constraints, timeline, stakeholders, budget/resources, and success criteria.
4. Create the plan task.
5. Poll for status (about every 5 minutes).
6. Download artifacts via `task_file_info` (cloud) or `task_download` (mcp_local helper).
6. If status is `failed`, optionally call `task_retry` (defaults to `model_profile=baseline`).
7. Download artifacts via `task_file_info` (cloud) or `task_download` (mcp_local helper).

---

@@ -26,7 +27,8 @@ This is the shortest path to a working PlanExe MCP integration.
2. `model_profiles`
3. `task_create`
4. `task_status`
5. `task_file_info`
5. `task_retry` (optional, only for failed tasks)
6. `task_file_info`

Optional local helper:
- `task_download` (provided by `mcp_local`, not `mcp_cloud`)
1 change: 1 addition & 0 deletions docs/mcp/mcp_welcome.md
@@ -22,6 +22,7 @@ No MCP experience is required to get started.
- **Get example prompts** — See what good prompts look like (detailed, typically ~300-800 words). It is the **caller’s responsibility** to take inspiration from these examples and ensure the prompt sent to PlanExe is of similar or better quality. A compact prompt shape works best: objective, scope, constraints, timeline, stakeholders, budget/resources, and success criteria. The agent can refine a vague idea into a high-quality prompt and show it to the user for approval before creating the plan.
- **Create a plan** — Send a prompt; PlanExe starts creating the plan (typically takes 10–20 minutes on baseline profile). If the input prompt is of low quality, the output plan will be crap too. Visible `task_create` options include `model_profile`.
- **Check progress** — Ask for status and see how far the plan has gotten.
- **Retry failed runs** — If status is `failed`, call `task_retry` (defaults to baseline model profile) to requeue the same task id.
- **Download the report** — When the plan is ready, the user specifies whether to download the HTML report or the zip of intermediary files (JSON, MD, CSV).

Developer note: `speed_vs_detail` is intentionally hidden from the visible `task_create` interface and is provided via tool-specific metadata when needed.
58 changes: 52 additions & 6 deletions docs/mcp/planexe_mcp_interface.md
@@ -15,7 +15,7 @@ The plan is a **project plan**: a DAG of steps (Luigi tasks) that produce artifa
Implementors should expose the following to agents so they understand what PlanExe does:

- **What:** PlanExe turns a plain-English goal into a strategic project-plan draft (20+ sections) in ~10–20 min. Sections include executive summary, interactive Gantt charts, investor pitch, SWOT, governance, team profiles, work breakdown, scenario comparison, expert criticism, and adversarial sections (premortem, self-audit, premise attacks) that stress-test the plan. The output is a draft to refine, not an executable or final document — but it surfaces hard questions the prompter may not have considered.
- **Required interaction order:** Call `prompt_examples` first. Optional before `task_create`: call `model_profiles` to inspect profile guidance and available models in each profile. Then complete a non-tool step: formulate a detailed prompt as flowing prose (not structured markdown), typically ~300-800 words, using the examples as a baseline; include objective, scope, constraints, timeline, stakeholders, budget/resources, and success criteria; get user approval. Only after approval, call `task_create`. Then poll `task_status` (about every 5 minutes); use `task_download` (mcp_local helper) or `task_file_info` (mcp_cloud tool) when complete (`pending`/`processing` = keep polling, `completed` = download now, `failed` = terminal). To stop, call `task_stop` with the `task_id` from `task_create`.
- **Required interaction order:** Call `prompt_examples` first. Optional before `task_create`: call `model_profiles` to inspect profile guidance and available models in each profile. Then complete a non-tool step: formulate a detailed prompt as flowing prose (not structured markdown), typically ~300-800 words, using the examples as a baseline; include objective, scope, constraints, timeline, stakeholders, budget/resources, and success criteria; get user approval. Only after approval, call `task_create`. Then poll `task_status` (about every 5 minutes); use `task_download` (mcp_local helper) or `task_file_info` (mcp_cloud tool) when complete (`pending`/`processing` = keep polling, `completed` = download now, `failed` = terminal unless retried). If a task fails and the caller wants another attempt for the same `task_id`, call `task_retry` (optional `model_profile`, default `baseline`). To stop, call `task_stop` with the `task_id` from `task_create`.
- **Output:** Self-contained interactive HTML report (~700KB) with collapsible sections and interactive Gantt charts — open in a browser. The zip contains the intermediary pipeline files (md, json, csv) that fed the report.

### 1.3 Scope of this document
@@ -70,10 +70,10 @@ The interface is designed to support:

The MCP specification defines two different mechanisms:

- **MCP tools** (e.g. task_create, task_status, task_stop): the server exposes named tools; the client calls them and receives a response. PlanExe's interface is **tool-based**: the agent calls task_create → receives task_id → polls task_status → uses task_file_info (and optionally task_download via mcp_local). This document specifies those tools.
- **MCP tools** (e.g. task_create, task_status, task_stop, task_retry): the server exposes named tools; the client calls them and receives a response. PlanExe's interface is **tool-based**: the agent calls task_create → receives task_id → polls task_status → optionally calls task_retry if the task failed → uses task_file_info (and optionally task_download via mcp_local). This document specifies those tools.
- **MCP tasks protocol** ("Run as task" in some UIs): a separate mechanism where the client can run a tool "as a task" using RPC methods such as tasks/run, tasks/get, tasks/result, tasks/cancel, tasks/list, so the tool runs in the background and the client polls for results.

PlanExe **does not** use or advertise the MCP tasks protocol. Implementors and clients should use the **tools only**. Do not enable "Run as task" for PlanExe; many clients (e.g. Cursor) and the Python MCP SDK do not support the tasks protocol properly. Intended flow: call `prompt_examples`; optionally call `model_profiles`; perform the non-tool prompt drafting/approval step; call `task_create`; poll `task_status`; then call `task_file_info` (or `task_download` via mcp_local).
PlanExe **does not** use or advertise the MCP tasks protocol. Implementors and clients should use the **tools only**. Do not enable "Run as task" for PlanExe; many clients (e.g. Cursor) and the Python MCP SDK do not support the tasks protocol properly. Intended flow: call `prompt_examples`; optionally call `model_profiles`; perform the non-tool prompt drafting/approval step; call `task_create`; poll `task_status`; if the task failed, optionally call `task_retry`; then call `task_file_info` (or `task_download` via mcp_local) when completed.

---

@@ -142,6 +142,7 @@ The public MCP `state` field is aligned with `TaskItem.state`:
- pending → processing when picked up by a worker
- processing → completed via normal success
- processing → failed via error
- failed → pending when `task_retry` is accepted
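
The transitions listed in this hunk can be expressed as a small lookup table. This is a sketch covering only the four transitions visible here; the full document may define others (the hunk starts mid-list), and `can_transition` is a hypothetical helper name.

```python
# Allowed (current_state, new_state) pairs, limited to those shown above.
ALLOWED_TRANSITIONS = {
    ("pending", "processing"),    # picked up by a worker
    ("processing", "completed"),  # normal success
    ("processing", "failed"),     # error
    ("failed", "pending"),        # task_retry accepted
}


def can_transition(current: str, new: str) -> bool:
    """Return True if moving from `current` to `new` is in the table."""
    return (current, new) in ALLOWED_TRANSITIONS
```

With this table, `completed` has no outgoing transitions, which matches the rule that only failed tasks can be retried.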

### 5.3 Invalid transitions

@@ -322,7 +323,7 @@ For the full catalog file:

**Important**

- task_id is a UUID returned by task_create. Use this exact UUID for task_status/task_stop/task_file_info (and task_download when using mcp_local).
- task_id is a UUID returned by task_create. Use this exact UUID for task_status/task_stop/task_retry/task_file_info (and task_download when using mcp_local).

**Behavior**

@@ -417,7 +418,49 @@ Requests the plan generation to stop. Pass the **task_id** (the UUID returned by

---

### 6.5 Download flow (task_download vs task_file_info)
### 6.5 task_retry

Retries a task that is currently in `failed` state.

**Request**

```json
{
"task_id": "5e2b2a7c-8b49-4d2f-9b8f-6a3c1f05b9a1",
"model_profile": "baseline"
}
```

**Input**

- task_id: UUID of a failed task.
- model_profile: optional (`baseline` | `premium` | `frontier` | `custom`), default `baseline`.

**Response**

```json
{
"task_id": "5e2b2a7c-8b49-4d2f-9b8f-6a3c1f05b9a1",
"state": "pending",
"model_profile": "baseline",
"retried_at": "2026-02-24T15:20:00Z"
}
```

**Required semantics**

- Only failed tasks are retryable.
- On success, the same task_id is reset to `pending` and requeued.
- Prior artifacts for that task are cleared before requeue.

**Error behavior**

- Unknown task_id: `TASK_NOT_FOUND` (`isError=true`).
- Task not failed: `TASK_NOT_FAILED` (`isError=true`).
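
The required semantics and error behavior can be illustrated with an in-memory sketch. Everything here is an assumption for illustration: the `Task` dataclass, the dict-based store, and the exact shape of the `error` object are hypothetical and not PlanExe's implementation, which uses a database and a worker queue.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Dict, List


@dataclass
class Task:
    state: str
    model_profile: str = "baseline"
    artifacts: List[str] = field(default_factory=list)


def _error(code: str, message: str) -> Dict[str, Any]:
    # Assumed error payload shape; only isError and the code come from the spec.
    return {"isError": True, "error": {"code": code, "message": message}}


def task_retry(store: Dict[str, Task], task_id: str,
               model_profile: str = "baseline") -> Dict[str, Any]:
    """Reset a failed task to pending, clearing its prior artifacts."""
    task = store.get(task_id)
    if task is None:
        return _error("TASK_NOT_FOUND", f"no task with id {task_id}")
    if task.state != "failed":
        return _error("TASK_NOT_FAILED", f"task is in state {task.state}")
    task.artifacts.clear()           # prior artifacts are cleared before requeue
    task.state = "pending"           # same task_id, back in the queue
    task.model_profile = model_profile
    retried_at = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return {"task_id": task_id, "state": "pending",
            "model_profile": model_profile, "retried_at": retried_at}
```

Checking for an unknown id before checking the state keeps the two error codes unambiguous for the caller.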

---

### 6.6 Download flow (task_download vs task_file_info)

**If your client exposes task_download** (e.g. mcp_local): use it to save the report or zip locally; it calls task_file_info under the hood, then fetches and writes to the local save path (e.g. PLANEXE_PATH).

@@ -473,6 +516,7 @@ Recommended practice for MCP clients:
Additional semantics:

- Every `task_create` call creates a new independent task with a new `task_id`.
- `task_retry` reuses the existing failed `task_id` (it does not create a new task id).
- The server does not deduplicate “same prompt” requests into a single shared task.
- Keep your own task registry/client state if you run multiple tasks concurrently.
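
A client-side registry for concurrent tasks could be as small as the following sketch. `TaskRegistry` is a hypothetical helper, and it assumes a freshly created task starts in `pending`.

```python
from typing import Dict, List

ACTIVE_STATES = {"pending", "processing"}


class TaskRegistry:
    """Client-side record of the tasks this client has created."""

    def __init__(self) -> None:
        self._states: Dict[str, str] = {}

    def add(self, task_id: str) -> None:
        # Every task_create returns a new task_id, so a repeat is a client bug.
        if task_id in self._states:
            raise ValueError(f"task {task_id} already tracked")
        self._states[task_id] = "pending"  # assumed initial state

    def update(self, task_id: str, state: str) -> None:
        # Also used after task_retry, which reuses the same task_id.
        if task_id not in self._states:
            raise KeyError(task_id)
        self._states[task_id] = state

    def active(self) -> List[str]:
        """Return the ids that still need polling, in sorted order."""
        return sorted(t for t, s in self._states.items() if s in ACTIVE_STATES)
```

Because `task_retry` keeps the id, a retried task is recorded with `update(task_id, "pending")` rather than a second `add`.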

@@ -501,7 +545,7 @@ Example:

### 9.2 isError behavior

- `task_create`, `task_status`, `task_stop`: unknown/invalid requests return `isError=true` with `error`.
- `task_create`, `task_status`, `task_stop`, `task_retry`: unknown/invalid requests return `isError=true` with `error`.
- `model_profiles`: returns `isError=true` with `MODEL_PROFILES_UNAVAILABLE` when no models are available in any profile.
- `task_file_info`: uses mixed behavior:
- returns `{}` (not an error) while artifacts are not ready.
@@ -516,6 +560,7 @@ Cloud/core tool codes:
- `INVALID_TOOL`: unknown MCP tool name.
- `INTERNAL_ERROR`: uncaught server error.
- `TASK_NOT_FOUND`: task id not found.
- `TASK_NOT_FAILED`: task_retry called for a task that is not in failed state.
- `INVALID_USER_API_KEY`: provided user_api_key is invalid.
- `USER_API_KEY_REQUIRED`: deployment requires user_api_key for task_create.
- `INSUFFICIENT_CREDITS`: caller account has no credits for task_create.
@@ -539,6 +584,7 @@ Local proxy specific codes:
- `USER_API_KEY_REQUIRED`
- `INSUFFICIENT_CREDITS`
- `INVALID_TOOL`
- For `TASK_NOT_FAILED`: call `task_retry` only after `task_status.state == failed`.
- For `TASK_NOT_FOUND`: verify task_id source and stop polling that id.
- For `generation_failed`: treat as terminal failure and surface task progress_message to user.

8 changes: 5 additions & 3 deletions docs/railway.md
@@ -2,11 +2,13 @@
title: Deploy PlanExe on Railway
---

# PlanExe on Railway - Experimental
# PlanExe on Railway

As of 2026-Jan-04, I'm experimenting with Railway. Currently the `frontend_multi_user` UI is an ugly MVP. I recommend going with the `frontend_single_user`, that doesn't use database.
This is what PlanExe looks like when it's deployed on Railway:
- Website: [home.planexe.org](https://home.planexe.org/)
- MCP interface: [mcp.planexe.org](https://mcp.planexe.org/)

In this project, the files named `railway.md` or `railway.toml`, are related to how things are configured in my Railway setup.
You can deploy PlanExe yourself on Railway, but it's not straightforward to get working. I recommend first getting Docker working on localhost, and only then moving on to Railway. The many Railway-related files are named `railway.md` or `railway.toml`, and describe how things are configured in my Railway setup.

## Project Settings

2 changes: 1 addition & 1 deletion frontend_multi_user/src/app.py
@@ -2027,7 +2027,7 @@ def index():
if is_admin:
user_id = self.admin_username
user = SimpleNamespace(name="Admin", given_name=None)
credits_balance_display = "N/A"
credits_balance_display = "Full access"
can_create_plan = True
else:
user_uuid = uuid.UUID(str(current_user.id))
24 changes: 21 additions & 3 deletions frontend_multi_user/templates/index.html
@@ -167,6 +167,24 @@
.stat-value.credits {
color: #059669;
}
.stat-card-link {
display: block;
text-decoration: none;
color: inherit;
transition: border-color 0.15s, box-shadow 0.15s;
}
.stat-card-link:hover {
border-color: var(--color-primary);
box-shadow: 0 0 0 1px var(--color-primary);
}
.stat-card-link .stat-value.credits {
text-decoration: underline;
text-underline-offset: 3px;
text-decoration-color: rgba(5, 150, 105, 0.4);
}
.stat-card-link:hover .stat-value.credits {
text-decoration-color: var(--color-primary);
}
.stat-label {
font-size: 0.8rem;
color: var(--color-text-secondary);
@@ -531,10 +549,10 @@ <h1>Welcome back, {{ user.name or user.given_name or "there" }}</h1>
</div>

<div class="stats-row">
<div class="stat-card">
<a href="{{ url_for('account') }}" class="stat-card stat-card-link" title="Manage credits and add more via Stripe">
<div class="stat-value credits">{{ credits_balance_display }}</div>
<div class="stat-label">Credits</div>
</div>
<div class="stat-label">Credits{% if not can_create_plan and not is_admin %} · Add more{% endif %}</div>
</a>
<div class="stat-card">
<div class="stat-value">{{ total_tasks_count }}</div>
<div class="stat-label">Plans</div>