feat: MCP server for programmatic tmux control #642

Draft
tony wants to merge 47 commits into master from mcp

tony commented Mar 8, 2026

Summary

Add an MCP (Model Context Protocol) server so AI agents (Claude Desktop, Claude Code, Codex, Gemini CLI, Cursor) can programmatically control tmux sessions.

  • 25 tools across 6 modules: server, session, window, pane, options, environment
  • 6 tmux:// resources for browsing tmux hierarchy via URI patterns
  • Socket isolation via LIBTMUX_SOCKET / LIBTMUX_SOCKET_PATH env vars for sub-agent safety
  • Server caching by (socket_name, socket_path, tmux_bin) tuple with is_alive() eviction
  • Input validation on direction and scope parameters with clear error messages
  • Full type safety (mypy strict passes) and 58 tests (all passing)

Architecture

src/libtmux/mcp/
    __init__.py           # Entry point: main()
    __main__.py           # python -m libtmux.mcp support
    server.py             # FastMCP instance
    _utils.py             # Server caching, resolvers, serializers, error handling
    tools/
        server_tools.py   # list_sessions, create_session, kill_server, get_server_info
        session_tools.py  # list_windows, create_window, rename_session, kill_session
        window_tools.py   # list_panes, split_window, rename_window, kill_window, select_layout, resize_window
        pane_tools.py     # send_keys, capture_pane, resize_pane, kill_pane, set_pane_title, get_pane_info, clear_pane
        option_tools.py   # show_option, set_option
        env_tools.py      # show_environment, set_environment
    resources/
        hierarchy.py      # tmux:// URI resources

Quick Start

Important

This feature is on the mcp branch and not yet released to PyPI. The methods below pull a snapshot from the branch at the time of install. To pick up new commits, re-run the install or add --reinstall to uvx.

Note

The uvx and uv commands require uv to be installed. Install it with curl -LsSf https://astral.sh/uv/install.sh | sh or see the uv installation docs.

One-liner setup (no clone needed)

The fastest way — uvx handles clone, deps, and execution automatically:

Claude Code:

$ claude mcp add libtmux -- uvx --from "git+https://github.com/tmux-python/libtmux.git@mcp[mcp]" libtmux-mcp

Codex CLI:

$ codex mcp add libtmux -- uvx --from "git+https://github.com/tmux-python/libtmux.git@mcp[mcp]" libtmux-mcp

Gemini CLI:

$ gemini mcp add libtmux uvx -- --from "git+https://github.com/tmux-python/libtmux.git@mcp[mcp]" libtmux-mcp

Cursor does not have an mcp add CLI command — use the JSON config below.

JSON config (all tools)

The same uvx pattern works in every tool's config file:

{
    "mcpServers": {
        "libtmux": {
            "command": "uvx",
            "args": ["--from", "git+https://github.com/tmux-python/libtmux.git@mcp[mcp]", "libtmux-mcp"],
            "env": {
                "LIBTMUX_SOCKET": "ai_workspace"
            }
        }
    }
}

| Tool | Config file | Format |
| --- | --- | --- |
| Claude Code | .mcp.json (project) or ~/.claude.json (global) | JSON |
| Claude Desktop | claude_desktop_config.json | JSON |
| Codex CLI | ~/.codex/config.toml | TOML (see below) |
| Gemini CLI | ~/.gemini/settings.json | JSON |
| Cursor | .cursor/mcp.json (project) or ~/.cursor/mcp.json (global) | JSON |

Codex CLI config.toml format

[mcp_servers.libtmux]
command = "uvx"
args = ["--from", "git+https://github.com/tmux-python/libtmux.git@mcp[mcp]", "libtmux-mcp"]

Development Install

For contributing or modifying the MCP server, use an editable install instead.

Clone the mcp branch:

$ git clone -b mcp https://github.com/tmux-python/libtmux.git ~/work/python/libtmux-mcp

Install in editable mode with the mcp extra:

$ cd ~/work/python/libtmux-mcp
$ uv pip install -e ".[mcp]"

Run the server:

$ libtmux-mcp

With an editable install, code changes take effect immediately — no reinstall needed.

Local checkout CLI setup

Point your tool at the local checkout via uv --directory. Changes in the worktree take effect immediately — no reinstall or snapshot refresh needed.

Claude Code:

$ claude mcp add --scope user libtmux -- uv --directory ~/work/python/libtmux-mcp run libtmux-mcp

Codex CLI:

$ codex mcp add libtmux -- uv --directory ~/work/python/libtmux-mcp run libtmux-mcp

Gemini CLI:

$ gemini mcp add --scope user libtmux uv -- --directory ~/work/python/libtmux-mcp run libtmux-mcp

Cursor — add to ~/.cursor/mcp.json:

{
    "mcpServers": {
        "libtmux": {
            "command": "uv",
            "args": [
                "--directory", "~/work/python/libtmux-mcp",
                "run", "libtmux-mcp"
            ]
        }
    }
}

Codex CLI config.toml format (local checkout)

[mcp_servers.libtmux]
command = "uv"
args = ["--directory", "~/work/python/libtmux-mcp", "run", "libtmux-mcp"]

Snapshot Install (pip / uv pip)

If you prefer a traditional install without uvx:

Using uv:

$ uv pip install "libtmux[mcp] @ git+https://github.com/tmux-python/libtmux.git@mcp"

Using pip:

$ pip install "libtmux[mcp] @ git+https://github.com/tmux-python/libtmux.git@mcp"

To update to the latest snapshot after new commits:

$ uv pip install --reinstall-package libtmux "libtmux[mcp] @ git+https://github.com/tmux-python/libtmux.git@mcp"

Environment Variables

| Variable | Purpose |
| --- | --- |
| LIBTMUX_SOCKET | tmux socket name (-L). Isolates the MCP server to a specific socket. |
| LIBTMUX_SOCKET_PATH | tmux socket path (-S). Alternative to socket name. |
| LIBTMUX_TMUX_BIN | Path to tmux binary. Useful for testing with different tmux versions. |

Test plan

  • uv run ruff check . passes
  • uv run ruff format . passes
  • uv run mypy passes (strict)
  • uv run pytest tests/mcp/ -v passes (58 tests)
  • Test with MCP Inspector: npx @modelcontextprotocol/inspector
  • Test with Claude Desktop
  • Test with Claude Code via .mcp.json

codecov bot commented Mar 8, 2026

Codecov Report

❌ Patch coverage is 80.75370% with 143 lines in your changes missing coverage. Please review.
✅ Project coverage is 58.04%. Comparing base (78b96c0) to head (de621ec).

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| src/libtmux/mcp/_utils.py | 70.58% | 35 Missing and 10 partials ⚠️ |
| src/libtmux/mcp/resources/hierarchy.py | 67.85% | 12 Missing and 6 partials ⚠️ |
| src/libtmux/mcp/tools/server_tools.py | 75.00% | 6 Missing and 6 partials ⚠️ |
| src/libtmux/mcp/server.py | 68.57% | 10 Missing and 1 partial ⚠️ |
| src/libtmux/mcp/tools/pane_tools.py | 92.19% | 7 Missing and 4 partials ⚠️ |
| src/libtmux/mcp/tools/session_tools.py | 81.03% | 7 Missing and 4 partials ⚠️ |
| src/libtmux/mcp/tools/window_tools.py | 85.71% | 8 Missing and 2 partials ⚠️ |
| src/libtmux/mcp/middleware.py | 63.63% | 8 Missing ⚠️ |
| src/libtmux/mcp/__init__.py | 12.50% | 7 Missing ⚠️ |
| src/libtmux/mcp/tools/option_tools.py | 79.41% | 6 Missing and 1 partial ⚠️ |

... and 1 more

Additional details and impacted files

@@            Coverage Diff             @@
##           master     #642      +/-   ##
==========================================
+ Coverage   51.19%   58.04%   +6.84%     
==========================================
  Files          25       40      +15     
  Lines        2590     3332     +742     
  Branches      402      493      +91     
==========================================
+ Hits         1326     1934     +608     
- Misses       1094     1196     +102     
- Partials      170      202      +32     


tony commented Mar 15, 2026

Screenshot today

Claude Code called this up automatically (I had forgotten it was installed). It found that a Vite server was running and which port it was on ❗, so it could hand that off to the Playwright MCP.

[Screenshot: 2026-03-15 - libtmux - first mcp call]

tony added 15 commits March 21, 2026 05:54
why: AI agents need programmatic tmux control via MCP protocol.
what:
- Add fastmcp optional dependency in pyproject.toml
- Add libtmux-mcp entry point script
- Create _utils.py with server caching, object resolvers, serializers,
  and @handle_tool_errors decorator
- Create server.py with FastMCP instance and registration
- Add __init__.py and __main__.py entry points
why: Provide comprehensive tool coverage for AI agents to manage tmux.
what:
- server_tools: list_sessions, create_session, kill_server, get_server_info
- session_tools: list_windows, create_window, rename_session, kill_session
- window_tools: list_panes, split_window, rename_window, kill_window,
  select_layout, resize_window
- pane_tools: send_keys, capture_pane, resize_pane, kill_pane,
  set_pane_title, get_pane_info, clear_pane
- option_tools: show_option, set_option
- env_tools: show_environment, set_environment
why: MCP resources let agents browse tmux state via URI patterns.
what:
- tmux://sessions - list all sessions
- tmux://sessions/{session_name} - session detail with windows
- tmux://sessions/{session_name}/windows - windows in session
- tmux://sessions/{session_name}/windows/{window_index} - window with panes
- tmux://panes/{pane_id} - pane details
- tmux://panes/{pane_id}/content - captured pane text
why: Ensure MCP server functionality works correctly with live tmux.
what:
- Add conftest.py with mcp_server, mcp_session, mcp_window, mcp_pane
  fixtures and server cache cleanup
- Add test_utils.py for resolver and serializer functions
- Add test files for all 6 tool modules
- Add test_resources.py with mock MCP for resource functions
why: Cache key only included (socket_name, socket_path), so changing
LIBTMUX_TMUX_BIN between calls returned a stale Server. Dead servers
were never evicted from the cache.
what:
- Change cache key to 3-tuple (socket_name, socket_path, tmux_bin)
- Add is_alive() check on cache hit to evict dead servers
- Add _invalidate_server() for explicit cache eviction
- Call _invalidate_server() in kill_server tool after server.kill()
- Update test fixtures for 3-tuple cache keys
- Add tests for is_alive eviction and _invalidate_server
… zoom

why: pane.resize_pane() always raises DeprecatedError since libtmux
v0.28, making the resize_pane MCP tool 100% broken.
what:
- Replace pane.resize_pane() with pane.resize() for height/width
- Add state-aware zoom: check window_zoomed_flag before toggling
- Add mutual exclusivity check for zoom + height/width
- Add tests for dimensions, zoom, and mutual exclusivity
why: Raw pane.cmd("kill-pane") skips stderr checking and structured
logging that Pane.kill() provides.
what:
- Replace pane.cmd("kill-pane") with pane.kill()
why: Raw window.cmd("resize-window") skips stderr checking and
self.refresh() that Window.resize() provides.
what:
- Replace raw cmd with window.resize(height=height, width=width)
- Add test for resize_window tool
why: split_window silently ignored invalid directions (fell to default),
create_window raised KeyError surfaced as "Unexpected error".
what:
- split_window: check _DIRECTION_MAP.get() result, raise ToolError if None
- create_window: use .get() with explicit ToolError on invalid direction
- Add tests for invalid direction in both tools
why: Invalid scope silently fell through to server scope, making it
impossible for users to detect typos like "global" vs "server".
what:
- Check _SCOPE_MAP.get() result, raise ToolError if scope is invalid
- Add test for invalid scope
why: Broad except Exception blocks caught all errors and returned them
as content strings, hiding real errors from the MCP client. FastMCP
natively converts unhandled exceptions to ResourceError.
what:
- Remove all 6 try/except Exception blocks from resource functions
- Raise ValueError for not-found sessions/windows/panes
- Remove unused logger import
why: Generic exceptions were re-raised as ToolError without logging,
making it impossible to diagnose unexpected errors in server logs.
what:
- Add logger.exception() before re-raising generic Exception as ToolError
why: MCP list tools need to expose libtmux's QueryList filtering via
an optional dict parameter, requiring a bridge between MCP dict params
and QueryList.filter(**kwargs).
what:
- Add _apply_filters() that validates operator keys against LOOKUP_NAME_MAP
- Raise ToolError with valid operators list on invalid lookup operator
- Short-circuit to direct serialization when filters is None/empty
- Add 6 parametrized tests: none, empty, exact, no_match, invalid_op, contains
why: LLM agents need to search across tmux objects without knowing the
exact hierarchy, and filter results using QueryList's 12 lookup operators.
what:
- Add optional filters param to list_sessions, list_windows, list_panes
- Broaden list_windows scope: omit session params to list all server windows
- Broaden list_panes scope: window > session > server fallback chain
- Add 9 parametrized tests for list_sessions filtering
- Add 7 parametrized tests for list_windows filtering + cross-session scope
- Add 7 parametrized tests for list_panes filtering + session/server scope
why: Cursor's composer-1/composer-1.5 models and some other MCP clients
cannot serialize nested dict tool arguments — they either stringify the
object or fail with a JSON parse error before dispatching. Claude and
GPT models through Cursor work fine; the bug is model-specific.

refs:
- https://forum.cursor.com/t/145807 (Dec 2025)
- https://forum.cursor.com/t/132571
- https://forum.cursor.com/t/151180 (Feb 2026)
- makenotion/notion-mcp-server#176 (Jan 2026)
- anthropics/claude-code#5504

what:
- Widen _apply_filters() to accept str, parse via json.loads()
- Widen tool signatures to dict | str | None for JSON Schema compat
- Add 5 parametrized test cases for string coercion and error paths
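
The string coercion and operator validation described in the last two commits can be sketched roughly as below. This is an illustration under stated assumptions, not the PR's code: `normalize_filters` is an invented name, `VALID_LOOKUPS` is a small hypothetical stand-in for libtmux's `LOOKUP_NAME_MAP`, and `FilterError` stands in for FastMCP's `ToolError`.

```python
from __future__ import annotations

import json


class FilterError(ValueError):
    """Stand-in for fastmcp's ToolError (assumption for this sketch)."""


# Hypothetical subset of lookup operators; the real list lives in
# libtmux's LOOKUP_NAME_MAP.
VALID_LOOKUPS = frozenset({"exact", "contains", "startswith", "regex"})


def normalize_filters(filters: dict | str | None) -> dict:
    """Coerce ``filters`` to a dict and validate ``field__operator`` keys."""
    if not filters:
        return {}  # None, "" and {} all mean "no filtering"
    if isinstance(filters, str):
        # Some clients stringify nested objects, so accept JSON text too.
        try:
            filters = json.loads(filters)
        except json.JSONDecodeError as exc:
            raise FilterError(f"Invalid filters JSON: {exc}") from exc
    if not isinstance(filters, dict):
        raise FilterError("filters must be a JSON object")
    for key in filters:
        if "__" in key:
            operator = key.rsplit("__", 1)[1]
            if operator not in VALID_LOOKUPS:
                raise FilterError(
                    f"Invalid operator {operator!r}; valid: {sorted(VALID_LOOKUPS)}"
                )
    return filters
```

With this shape, `normalize_filters('{"session_name__contains": "dev"}')` and the equivalent dict argument produce the same result, which is the point of widening the signature.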
tony added 11 commits March 21, 2026 06:40
why: `getattr(window, "window_zoomed_flag", "0")` always returned the
default `"0"` because `window_zoomed_flag` is not a field on libtmux's
`Window` object. This caused `zoom=True` on an already-zoomed pane to
toggle it OFF (since `pane.resize(zoom=True)` is a toggle), and
`zoom=False` on a zoomed pane to be a no-op.
what:
- Query zoom state via `window.cmd("display-message", "-p", "#{window_zoomed_flag}")`
- Preserve idempotent semantics: zoom=True ensures zoomed, zoom=False ensures unzoomed
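
Because the zoom command is a toggle, idempotent behavior comes down to toggling only when the current state differs from the requested one. A minimal sketch of that decision, assuming the flag text has already been read from tmux (the helper name `needs_zoom_toggle` is hypothetical):

```python
def needs_zoom_toggle(zoomed_flag: str, want_zoomed: bool) -> bool:
    """Return True when the toggle must run to reach the wanted state.

    ``zoomed_flag`` is the raw text tmux prints for
    ``#{window_zoomed_flag}``: "1" when zoomed, "0" otherwise.
    """
    currently_zoomed = zoomed_flag.strip() == "1"
    return currently_zoomed != want_zoomed


# zoom=True on an already-zoomed window must not toggle it off.
print(needs_zoom_toggle("1", want_zoomed=True))   # False
# zoom=False on a zoomed window must toggle once to unzoom.
print(needs_zoom_toggle("1", want_zoomed=False))  # True
```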
why: When `pane_id` was provided to `split_window`, the code resolved
the pane but then called `window.split()`, which delegates to
`self.active_pane.split()`. If the specified pane was not the active
pane, the wrong pane got split.
what:
- Call `pane.split()` directly when `pane_id` is provided
- Move direction validation before the pane/window branch
- Keep `window.split()` path for window-level targeting
why: When `target` was provided without `scope`, `_resolve_option_target`
silently ignored the target and returned the server object. This caused
`show_option(option="x", target="my_session")` to query the server
instead of the intended session — a fail-open behavior.
what:
- Raise ToolError when target is provided but scope is None
- Add test for target-without-scope error path
why: `clear_pane` called `pane.clear()` which sends the literal text
"reset" + Enter as keystrokes to the pane's foreground process. For
non-shell panes (vim, REPL, TUI), this injects unexpected input. The
tool's contract says "clear the pane" but the annotation says
`destructiveHint: False`, compounding the mismatch.
what:
- Use `pane.reset()` which does tmux-level `send-keys -R \; clear-history`
- This resets the terminal state and clears history without injecting keystrokes
why: `%%1` in plain Python strings is literally two percent signs, not
an escaped `%1`. These docstrings become MCP tool descriptions shown to
AI agents, which would send `%%1` as the pane_id and fail lookups since
tmux pane IDs use a single `%` prefix (`%0`, `%1`, etc.).
what:
- Replace all 11 instances of `%%1` with `%1` across server.py,
  _utils.py, pane_tools.py, and hierarchy.py
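
A quick way to see the difference: `%%` only collapses to `%` when a string goes through `%`-formatting. The docstring text below is made up for illustration.

```python
# A plain string keeps both percent signs exactly as written.
plain = "Pane ID (e.g. '%%1')"

# Only %-formatting treats %% as an escaped single percent sign.
formatted = "Pane ID (e.g. '%%1')" % ()

print(plain)      # Pane ID (e.g. '%%1')
print(formatted)  # Pane ID (e.g. '%1')
```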
why: FastMCP runs sync tool functions in a thread pool via
`anyio.to_thread.run_sync()`. The compound check-then-act pattern in
`_get_server` (check `in`, access `[]`, possibly `del`) was not atomic,
allowing concurrent tool calls to hit a `KeyError` when one thread
deletes a dead server's cache entry between another thread's `in` check
and `[]` access.

Additionally, `_invalidate_server` did not resolve env vars
(`LIBTMUX_SOCKET`, `LIBTMUX_SOCKET_PATH`), so calling
`_invalidate_server(socket_name=None)` would search for `key[0] == None`
but the cache key created by `_get_server` used the resolved env var
value.
what:
- Add `threading.Lock` to protect `_server_cache` in both `_get_server`
  and `_invalidate_server`
- Add env var resolution to `_invalidate_server` to match `_get_server`
why: Resource handlers raised `ValueError` for not-found conditions,
while tool modules consistently use `ToolError` from fastmcp. FastMCP
provides `ResourceError` specifically for resource operation errors.
Using `ValueError` produces inconsistent error presentation to MCP
clients since it's not a `FastMCPError` subclass.
what:
- Import and use `fastmcp.exceptions.ResourceError` for all not-found
  conditions in resource handlers
why: `get_server_info` called `server.is_alive()` twice — once for
the `is_alive` field and once to guard `len(server.sessions)`. Each
call spawns a `tmux list-sessions` subprocess.
what:
- Store `is_alive()` result in local variable and reuse
why: The `libtmux-mcp` console script is always installed via
`[project.scripts]`, but `fastmcp` is only declared in
`[project.optional-dependencies]`. A plain `pip install libtmux` installs
the CLI entrypoint, but invoking it crashes with `ModuleNotFoundError`.
what:
- Catch `ImportError` in `main()` and print a helpful install message
  directing users to `pip install libtmux[mcp]`
why: The MCP spec (2025-06-18) defines 4 tool annotation hints with
defaults that can be misleading — `destructiveHint` defaults to `true`
and `openWorldHint` defaults to `true`. Tools that only set
`readOnlyHint: true` inherited the contradictory `destructiveHint: true`
default. Since all tools interact with local tmux (not external APIs),
`openWorldHint` should be `false` across the board.

Additionally, the MCP spec supports `title` on tools and resources for
human-readable display in MCP clients, but none were set.
what:
- Set all 4 annotations explicitly on all 25 tools (readOnlyHint,
  destructiveHint, idempotentHint, openWorldHint)
- Add human-readable `title` to all 25 tools and 6 resources
- Set `openWorldHint: false` everywhere (local tmux, not external APIs)
- Set `idempotentHint: true` on rename/set/resize/select/kill tools
- Update MockMCP in test_resources.py to accept **kwargs
why: The MCP lifecycle spec shows `serverInfo` with `name`, `title`, and
`version` fields. The server was missing `version`. The instructions
string also lacked the tmux hierarchy model and env var configuration
that help LLMs use tools effectively.
what:
- Add `version` from `libtmux.__about__.__version__`
- Add tmux hierarchy description (Server > Session > Window > Pane)
- Document LIBTMUX_SOCKET env var default for socket_name
tony added 4 commits March 21, 2026 07:51
why: The previous pin `>=2.3.0` was too loose — FastMCP 3.x has a
different API from 2.x (e.g. `title=` kwarg on `mcp.tool()`,
`version=` on constructor). A future 4.x release could also break.
The server uses 3.x features added in 3.1.0.
what:
- Pin fastmcp to `>=3.1.0,<4.0.0`
- Update uv.lock
…split_window

why: All 7 pane tools and `split_window` accepted `session_name` but
not `session_id`, while session/window-level tools consistently accept
both. The `_resolve_pane()` and `_resolve_window()` helpers already
support `session_id` — it just wasn't exposed in the tool signatures.
what:
- Add `session_id: str | None = None` parameter to send_keys,
  capture_pane, resize_pane, kill_pane, set_pane_title, get_pane_info,
  clear_pane, and split_window
- Pass through to _resolve_pane()/_resolve_window()
- Improve capture_pane start/end and split_window size descriptions
  (verified against tmux C source: cmd-capture-pane.c, cmd-split-window.c)
- Clarify suppress_history as libtmux abstraction (space prefix)
why: Several MCP tool parameter descriptions were ambiguous when
cross-referenced against the tmux C source code. LLMs using these tools
need precise format guidance to construct correct arguments.

Verified against:
- layout-set.c:43-49 for built-in layout names
- cmd-split-window.c for size format
- option_tools target format per scope
what:
- select_layout: list all 7 built-in layout names from tmux source
- option_tools target: document expected format per scope (session name,
  window ID '@1', pane ID '%1')
why: Parameters like `direction` and `scope` were typed as `str | None`,
so the MCP input schema showed `{"type": "string"}` — LLMs had to read
descriptions to discover valid values. Pydantic generates
`{"enum": ["above", "below", ...]}` from `Literal` types, putting valid
values directly in the JSON schema where LLMs can see them.
what:
- Use `t.Literal["above", "below", "left", "right"]` for split direction
- Use `t.Literal["before", "after"]` for window placement direction
- Use `t.Literal["server", "session", "window", "pane"]` for option scope
- Keep manual validation as safety net for direct callers (belt-and-suspenders)
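
A sketch of the belt-and-suspenders idea: the `Literal` type drives the schema, and the same type supplies the values for a runtime check. `validate_direction` is a hypothetical helper and the local `ToolError` class is a stand-in for FastMCP's exception.

```python
from __future__ import annotations

import typing as t

SplitDirection = t.Literal["above", "below", "left", "right"]


class ToolError(Exception):
    """Stand-in for fastmcp's ToolError (assumption for this sketch)."""


def validate_direction(direction: str | None) -> str | None:
    """Reject values outside the Literal, for callers that bypass the schema."""
    valid = t.get_args(SplitDirection)  # ('above', 'below', 'left', 'right')
    if direction is not None and direction not in valid:
        raise ToolError(
            f"Invalid direction: {direction!r}. Valid: {', '.join(valid)}"
        )
    return direction
```

Deriving the valid values from the `Literal` with `typing.get_args` keeps the schema and the runtime check from drifting apart.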
tony added 17 commits March 21, 2026 10:50
why: Tools returned manually-constructed JSON strings with no
`outputSchema` in the MCP tool definitions. MCP clients couldn't
validate or introspect results, and tests had to `json.loads()` every
return value.
what:
- Add `models.py` with Pydantic BaseModel classes: SessionInfo,
  WindowInfo, PaneInfo, ServerInfo, OptionResult, OptionSetResult,
  EnvironmentSetResult
- Each field has `Field(description=...)` for MCP schema documentation
- FastMCP auto-generates `outputSchema` from these return types
…N strings

why: Returning typed Pydantic models instead of `json.dumps()` strings
gives MCP clients auto-generated `outputSchema` for result validation,
lets tests assert on model attributes directly (`result.session_name`)
instead of `json.loads(result)["session_name"]`, and centralizes field
documentation in model `Field(description=...)` definitions.
what:
- Update `_serialize_*` functions to return Pydantic models
- Update `_apply_filters` with generic TypeVar for typed returns
- Change tool return types: `str` → `SessionInfo`, `WindowInfo`,
  `PaneInfo`, `ServerInfo`, `OptionResult`, `OptionSetResult`,
  `EnvironmentSetResult`
- Keep `str` returns for message-only tools (kill_*, send_keys,
  clear_pane) and text output (capture_pane, show_environment)
- Update resources to use `.model_dump()` before `json.dumps()`
- Update all tests to assert on model attributes directly
…tions

why: When users ask what panes "contain" or "mention", LLMs default to
metadata-only filters (e.g. window_name__contains) instead of reading
pane contents. MCP server instructions are injected into the LLM's system
prompt and are the primary mechanism for guiding tool selection workflows.
what:
- Add paragraph distinguishing metadata tools (list_*) from content tools
- Reference search_panes and capture_pane as content-search approaches
- Use trigger words ("contain", "mention", "show", "have") that match
  natural user language
why: Tool descriptions are evaluated by LLMs at tool-selection time. Without
explicit scope clarification, LLMs interpret "list windows mentioning X" as
a metadata filter rather than a content search. Adding scope notes and
cross-references to search_panes helps LLMs choose the correct tool.
what:
- Add metadata-only note to list_windows docstring
- Add metadata-only note to list_panes docstring
- Add cross-reference to search_panes in capture_pane docstring
why: LLMs need a single tool to search for text visible in terminal panes.
Without this, content search requires multi-step choreography (list_panes +
capture_pane on each), which LLMs handle unreliably.

The implementation uses tmux's native `list-panes -f "#{C:pattern}"` for
a fast first pass — this runs `window_pane_search()` in C, searching the
pane grid directly without serialization. Only matching panes are then
captured to extract the actual matched lines. When the pattern contains
regex metacharacters, falls back to capturing all panes (tmux's `#{C:}`
uses glob matching, not regex).

Refs:
- #645
- #646
- http://blog.modelcontextprotocol.io/posts/2025-11-03-using-server-instructions/
what:
- Add PaneContentMatch model to models.py
- Add search_panes tool with two-phase tmux-optimized search
- Register as read-only tool with "Search Panes" title
- Add parametrized tests using SearchPanesFixture NamedTuple (7 cases)
- Add standalone tests for model types, parent context, and error handling
- Use retry_until instead of time.sleep for test reliability
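
The choice between the two phases can be illustrated with a small predicate. This is a guess at the shape of the check, not the PR's code: the function name and the exact metacharacter set are assumptions.

```python
# Characters treated as regex syntax in this sketch (assumed set).
_REGEX_META = frozenset(r"\.^$*+?{}[]|()")


def can_use_tmux_filter(pattern: str) -> bool:
    """True when the pattern is plain text and the tmux-side fast path applies.

    Patterns containing regex metacharacters fall back to capturing every
    pane and matching in Python.
    """
    return not any(ch in _REGEX_META for ch in pattern)


print(can_use_tmux_filter("vite"))         # True
print(can_use_tmux_filter(r"error \d+"))   # False
```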
…ction

why: When an LLM agent runs inside tmux, the MCP server inherits `TMUX_PANE`
and `TMUX` environment variables. Without self-awareness, tools like
`search_panes` return the agent's own pane in results with no way for the
LLM to distinguish it from other panes — forcing a separate `echo $TMUX_PANE`
shell command. This follows the "inform, never decide" constitutional
principle: enrich metadata without changing tool behavior.
what:
- Add `_get_caller_pane_id()` helper to `_utils.py` reading `TMUX_PANE` env var
- Add `is_caller: bool | None` field to `PaneInfo` and `PaneContentMatch` models
- Annotate `_serialize_pane()` and `search_panes()` results with `is_caller`
- Add `_build_instructions()` to `server.py` that appends agent tmux context
  (pane ID, socket name) to server instructions when `TMUX_PANE` is set
- Add parametrized tests for serializer, search_panes, and instructions builder
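
A minimal sketch of the caller detection, assuming `is_caller` is `None` when the server is not running inside tmux (the commit does not spell out the `None` case, so that meaning is an assumption). The function names here are illustrative.

```python
from __future__ import annotations

import os
from typing import Mapping


def get_caller_pane_id(environ: Mapping[str, str] | None = None) -> str | None:
    """Read the pane ID tmux exports to processes started inside a pane."""
    env = os.environ if environ is None else environ
    return env.get("TMUX_PANE") or None


def is_caller(pane_id: str, caller_pane_id: str | None) -> bool | None:
    """Annotate a pane: True/False when known, None outside tmux."""
    if caller_pane_id is None:
        return None
    return pane_id == caller_pane_id
```

The annotation only adds metadata. It does not filter the caller's pane out of results, in line with the "inform, never decide" principle quoted above.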
why: `send_keys` sends keystrokes to a terminal — it's a side-effecting
action, not a destructive one. MCP's `destructiveHint` means "may perform
destructive updates" like killing sessions or deleting data. MCP clients
gate destructive tools behind extra confirmation dialogs, adding unnecessary
friction to the most-used tool in the server.

Refs:
- https://blog.modelcontextprotocol.io/posts/2026-03-16-tool-annotations/
what:
- Change `destructiveHint` from `True` to `False` in send_keys registration
…r tags

why: Annotation dicts (`_RO`, `_IDEM`, inline destructive dicts) were
duplicated identically across all 6 tool modules — a DRY violation that
makes maintenance error-prone. Additionally, tools had no programmatic
categorization beyond annotations, preventing middleware-based safety
gating.

Refs:
- https://gofastmcp.com/servers/tools (tags parameter)
- https://gofastmcp.com/servers/middleware (tag-based filtering)
what:
- Add TAG_READONLY, TAG_MUTATING, TAG_DESTRUCTIVE constants to _utils.py
- Add ANNOTATIONS_RO, ANNOTATIONS_MUTATING, ANNOTATIONS_CREATE,
  ANNOTATIONS_DESTRUCTIVE presets to _utils.py
- Add VALID_SAFETY_LEVELS frozenset for validation
- Replace all local _RO/_IDEM/inline dicts in 6 tool modules with imports
- Add tags={TAG_*} to all 25 mcp.tool() registrations
- Add tests for constant correctness and completeness
why: The MCP server exposes `kill_server`, `kill_session`, `kill_window`,
and `kill_pane` to any connected client with no safety guardrails beyond
annotation hints. MCP clients SHOULD respect `destructiveHint`, but hints
are advisory — the spec explicitly says they are untrusted.

This middleware provides server-side defense in depth: tools tagged above
the configured tier are hidden from `on_list_tools` AND blocked from
`on_call_tool`. The double-gate prevents both discovery and execution.

Configured via `LIBTMUX_SAFETY` env var:
- `readonly`: only read operations
- `mutating` (default): read + write + send_keys
- `destructive`: all operations including kill_*

Refs:
- https://gofastmcp.com/servers/middleware
- https://blog.modelcontextprotocol.io/posts/2026-03-16-tool-annotations/
what:
- Add SafetyMiddleware class in new middleware.py with _is_allowed(),
  on_list_tools(), and on_call_tool() hooks
- Wire SafetyMiddleware into FastMCP instance in server.py, reading
  LIBTMUX_SAFETY env var with fallback to TAG_MUTATING
- Add parametrized tests (10 tier combinations + default + fallback)
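
The tier comparison at the heart of the middleware might look like the sketch below. The tag strings and the `is_allowed` signature are assumptions, and denying untagged tools is a fail-closed choice made for this sketch; this commit does not state how untagged tools are handled.

```python
from __future__ import annotations

# Assumed tag values, ordered from least to most dangerous.
TIER_ORDER = {"readonly": 0, "mutating": 1, "destructive": 2}


def is_allowed(tool_tags: set[str], max_tier: str) -> bool:
    """True when every safety tag on the tool is at or below ``max_tier``."""
    limit = TIER_ORDER[max_tier]
    tiers = [TIER_ORDER[tag] for tag in tool_tags if tag in TIER_ORDER]
    if not tiers:
        return False  # untagged tool: deny (assumption, fail-closed)
    return max(tiers) <= limit


# At the default "mutating" level, kill_* style tools are blocked.
print(is_allowed({"destructive"}, "mutating"))  # False
print(is_allowed({"readonly"}, "mutating"))     # True
```

The same predicate can back both hooks: filter the tool list with it, then check it again at call time.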
why: The LLM needs to know what tools are available at the current safety
level. Without this, an agent at `readonly` level might attempt mutating
operations and get confusing ToolError messages. Including the safety level
in instructions lets the LLM self-regulate — it knows what tier it's
operating at and what `LIBTMUX_SAFETY` values are available.
what:
- Refactor _build_instructions() to accept safety_level parameter
- Always include "Safety level: {level}" section with tier descriptions
- Update parametrized test fixtures to cover readonly/destructive levels
- Add test asserting safety level text is always present in instructions
why: The resolver chain (`_resolve_session`, `_resolve_window`,
`_resolve_pane`) falls back to `items[0]` when no identifier is provided.
For destructive tools, this means `kill_session()` with no args silently
kills whichever session happens to be first — and calling it twice kills
two different sessions. This is dangerous for an AI-facing control plane
where the LLM might omit a target parameter.

The `idempotentHint: True` on `ANNOTATIONS_DESTRUCTIVE` compounded the
risk — MCP clients that trust the hint might auto-retry on failure,
escalating destruction.
what:
- Add explicit target guards to kill_session, kill_window, kill_pane
  that raise ToolError when no targeting parameter is provided
- Fix ANNOTATIONS_DESTRUCTIVE idempotentHint from True to False
- Add test_kill_*_requires_target tests for all three tools
why: `matching_pane_ids` was collected into a `set()`, which has
nondeterministic iteration order in Python. This caused search_panes
results to appear in different order across calls for the same query,
making agent behavior unpredictable and tests fragile.
what:
- Replace set() with list(dict.fromkeys(...)) for order-preserving dedup
  in both the tmux fast path and Python fallback path
- Sort final matches by pane_id for fully deterministic output
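
The dedup idiom in isolation, with made-up pane IDs:

```python
pane_ids = ["%3", "%1", "%3", "%2", "%1"]

# dict keys keep insertion order, so this removes duplicates
# while preserving first-seen order.
unique = list(dict.fromkeys(pane_ids))
print(unique)   # ['%3', '%1', '%2']

# A final sort makes the output independent of discovery order.
ordered = sorted(unique)
print(ordered)  # ['%1', '%2', '%3']
```

Note that a plain string sort is lexicographic, so `%10` would sort before `%2`. The sort key the PR uses is not shown here.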
…ry commands

why: `pane.reset()` sent `send-keys -R \; clear-history` as a single
`cmd()` call. Since `subprocess.Popen` is called without `shell=True`,
the `\;` is never interpreted as a tmux command separator — it's passed
as a literal argument to `send-keys`. This means `clear-history` never
executes, and scrollback is never cleared.

Refs:
- #650
what:
- Split into two separate cmd() calls: send-keys -R then clear-history
- Strengthen test_clear_pane to verify marker text disappears from
  scrollback after clearing

Closes #650
why: All 6 resource handlers called `_get_server()` with no arguments,
while every tool function accepts `socket_name`. The server instructions
promise multi-server support via `socket_name`, but resources were
limited to the default socket — a capability gap.

FastMCP supports RFC 6570 query parameters via `{?param}` syntax in
URI templates. Query parameters must be optional with default values.

Refs:
- #647
what:
- Add `{?socket_name}` to all 6 resource URI templates
- Add `socket_name: str | None = None` parameter to all resource handlers
- Pass socket_name through to `_get_server()` calls
- Update test URI template keys to match new templates

Closes #647
why: Existing resource tests used a MockMCP class that called handler
functions directly, never exercising FastMCP's URI routing, parameter
extraction, or MCP protocol handling. This meant transport-level bugs
(URI encoding, template matching, response formatting) couldn't be caught.

FastMCP's `Client(mcp)` enables in-process testing against the real MCP
protocol stack without network transport.

Refs:
- #649
what:
- Add test_resources_integration.py with parametrized ResourceIntegrationFixture
- 5 test cases covering sessions list, session detail, session windows,
  pane detail, and pane content via real Client(mcp).read_resource() calls
- Uses asyncio.run() wrapper to keep tests synchronous (no pytest-asyncio dep)

Closes #649
why: Terminal automation requires waiting for specific output to appear
(e.g., build completion, prompt return). Agents currently must poll
`capture_pane` repeatedly, consuming tokens and turns. A dedicated tool
saves both by encapsulating the poll loop server-side.

Uses libtmux's existing `retry_until` infrastructure (8s default timeout,
50ms poll interval) and the same regex pattern matching as `search_panes`.

Refs:
- #651
what:
- Add WaitForTextResult Pydantic model to models.py
- Add wait_for_text() tool using retry_until internally
- Returns structured result with found/timed_out/matched_lines/elapsed
- Tagged TAG_READONLY with ANNOTATIONS_RO (read-only, idempotent)
- Add parametrized WaitForTextFixture tests (found + timeout cases)
- Add test_wait_for_text_invalid_regex for bad pattern handling

Closes #651
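
A rough sketch of the server-side poll loop. It swaps in a plain `time.monotonic()` loop for libtmux's `retry_until`, takes a hypothetical `capture` callable in place of a real pane capture, and uses a dataclass rather than the PR's Pydantic `WaitForTextResult`. The field names follow the commit message.

```python
from __future__ import annotations

import re
import time
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class WaitResult:
    found: bool
    timed_out: bool
    matched_lines: list[str] = field(default_factory=list)
    elapsed: float = 0.0


def wait_for_text(
    capture: Callable[[], list[str]],
    pattern: str,
    timeout: float = 8.0,
    interval: float = 0.05,
) -> WaitResult:
    """Poll ``capture()`` until a line matches ``pattern`` or time runs out."""
    compiled = re.compile(pattern)  # raises re.error on an invalid pattern
    start = time.monotonic()
    while True:
        matches = [line for line in capture() if compiled.search(line)]
        elapsed = time.monotonic() - start
        if matches:
            return WaitResult(True, False, matches, elapsed)
        if elapsed >= timeout:
            return WaitResult(False, True, [], elapsed)
        time.sleep(interval)
```

From the agent's side this is one tool call and one result, instead of one `capture_pane` round trip per poll.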
why: Three independent reviews found converging correctness and safety
gaps: literal search was broken for regex-looking strings, destructive
tools allowed ambiguous fallback targeting, and untagged tools bypassed
the safety middleware.

what:
- Add regex=False param to search_panes and wait_for_text; default to
  literal matching via re.escape(), preserving regex opt-in
- Require exact IDs for destructive tools: kill_pane(pane_id),
  kill_window(window_id), kill_session(session_name|session_id)
- Add self-kill guard: refuse to kill the pane/window/session/server
  hosting this MCP server (detected via TMUX_PANE env var)
- Make safety middleware fail-closed: untagged tools are now denied
- Add FastMCP native visibility (mcp.enable(tags=..., only=True)) as
  primary gate alongside middleware for better error messages
- Log warning on invalid LIBTMUX_SAFETY env var instead of silent fallback
- Return EnvironmentResult Pydantic model from show_environment instead
  of raw JSON string
- Set on_duplicate="error" on FastMCP constructor
- Update all affected tests
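
The literal-by-default matching from the first item above can be sketched like this (the helper name `compile_pattern` is hypothetical):

```python
from __future__ import annotations

import re


def compile_pattern(pattern: str, regex: bool = False) -> re.Pattern[str]:
    """Compile ``pattern`` literally unless the caller opts in to regex."""
    return re.compile(pattern if regex else re.escape(pattern))


# Literal mode: "$ make" matches the text as typed.
print(bool(compile_pattern("$ make").search("$ make test")))              # True
# Regex mode: "$" is an end-of-string anchor, so nothing matches.
print(bool(compile_pattern("$ make", regex=True).search("$ make test")))  # False
```

Defaulting to literal matching means strings that merely look like regex, such as shell prompts or file paths with dots, behave as users expect.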