chore: align with katana-openapi-client safety + MCP fixes#60
Merged
Conversation
… branches When a feature branch is created via ``git checkout -b <name> origin/main``, git sets the new local branch's upstream to ``origin/main``. A subsequent ``git push -u origin <name>`` then resolves to the tracked upstream and pushes straight to main — bypassing PR review entirely. Our sister repo (katana-openapi-client commit 30f3fd86) hit exactly this in production: a non-PR push to main triggered semantic-release and published an unintended PyPI build before the pipeline could be cancelled. Statuspro has no equivalent guard. Adds the same mechanical line of defense katana ported in #434: - ``scripts/pre-push-guard.sh`` — pre-commit-compatible pre-push hook that refuses pushes where ``remote_ref == refs/heads/main`` and ``local_ref != refs/heads/main``. Suggests the safe form (``git push -u origin HEAD:refs/heads/<branch>``) in the rejection message. - ``.pre-commit-config.yaml`` — adds ``default_install_hook_types: [pre-commit, pre-push]`` and registers ``block-push-to-main`` in the ``pre-push`` stage. Existing ``pre-commit install`` invocations now install both hook types. Existing collaborators must re-run ``uv run pre-commit install`` once for the pre-push hook to take effect; the README quick-start handles fresh clones automatically. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Pairs with the pre-push guard added in the previous commit. Documents the push-refspec trap in two places that don't read CLAUDE.md mid-flow: - ``.claude/skills/open-pr/SKILL.md`` CRITICAL block + Phase 5 — mandates ``git push -u origin HEAD:refs/heads/<branch>`` over the bare-branch form. The bare form resolves to the local branch's tracked upstream; if a branch was created via ``git checkout -b <name> origin/main`` the upstream is ``origin/main`` and the bare push targets main. - ``CLAUDE.md`` Known Pitfalls — same explanation, plus a worktree note pointing out that pre-commit hooks (including the new pre-push guard) aren't shared across worktrees and require ``pre-commit install`` per worktree. Lifted from katana-openapi-client's harness retro (PR #441) where the same trap bit twice during PR development. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Both ``test_observe_service_timing_accuracy`` and ``test_observe_tool_timing_accuracy`` were measuring real time after ``await asyncio.sleep(...)`` and asserting the duration fell within loose-lower-bound / tight-upper-bound brackets: - tool variant: ``50 <= duration_ms < 200`` after a 100ms sleep - service variant: ``50 <= duration_ms < 500`` after a 50ms sleep The tool variant's ``< 200`` upper bound is the same flake pattern that broke katana-openapi-client CI on Python 3.14 (235.7ms < 100ms in their analogous service test, fixed in their #450 sweep). asyncio scheduler + GIL contention under CI runner load routinely pushes the measured value well above tight upper bounds. Both tests now mock ``time.perf_counter`` (the module-level binding the decorator imports as ``time.perf_counter`` in ``statuspro_mcp.logging``) with ``side_effect=[start, end]``. The computed duration is exact and deterministic — no real sleep, no scheduler variance, exact-equality assertion. The tests' name claims "timing accuracy" and now actually tests that: the decorator computes ``(end - start) * 1000`` correctly. Real asyncio scheduling latency was never the right thing to assert. Test runtime drops from ~150ms (sleep-bound) to ~1ms (deterministic). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…trip)
Four Cancel buttons in Prefab preview UIs were sending fake user messages
back through the LLM ("Cancel the status change" / "Cancel the comment" /
"Cancel the due date change" / "Cancel the bulk update"). The model would
acknowledge in chat for what is functionally a "do nothing" action,
cluttering the conversation with a synthetic user message + LLM response.
Replaced with ``ShowToast`` (client-side only, no server trip per the
prefab_ui actions docstring): the user gets visible "Cancelled" feedback
in the iframe overlay, no chat message is appended, no LLM round-trip
happens, no tool is invoked.
Sites:
- ``build_status_change_preview_ui`` → "Status change cancelled"
- ``build_comment_preview_ui`` → "Comment cancelled"
- ``build_due_date_change_preview_ui`` → "Due date change cancelled"
- ``build_bulk_status_change_preview_ui`` → "Bulk update cancelled"
The two remaining ``SendMessage`` sites in ``prefab_ui.py`` (lines 232,
266) are intentional action triggers ("Add a comment to order...", etc.)
that prompt the LLM to follow up — those stay as is.
Ported from katana-openapi-client #440.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
This PR ports several safety/process guardrails, test reliability fixes, and MCP UX improvements from katana-openapi-client into statuspro-openapi-client, primarily focused on preventing accidental pushes to main and making MCP tool interactions more robust and user-friendly.
Changes:
- Add a
pre-pushguard (and hook installation defaults) to block accidental non-main→mainpushes; update contributor/docs skills to mandate safe push refspecs. - Make observability timing tests deterministic by mocking
time.perf_counter(removing sleep/jitter flakes). - Improve MCP UX and input robustness: replace “Cancel” buttons’ synthetic messages with
ShowToast, and addBeforeValidator-based coercion for LLM-mistyped list parameters (with unit tests) and apply it to Orders tools.
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
statuspro_mcp_server/tests/tools/test_list_coercion.py |
Adds unit coverage for list-input coercion behavior across CSV/JSON-stringified inputs and error cases. |
statuspro_mcp_server/tests/test_observability_decorators.py |
Makes timing assertions deterministic by patching perf_counter instead of using real sleeps. |
statuspro_mcp_server/src/statuspro_mcp/tools/prefab_ui.py |
Swaps “Cancel” actions from SendMessage(...) to local ShowToast(...) to avoid unnecessary LLM round-trips. |
statuspro_mcp_server/src/statuspro_mcp/tools/orders.py |
Applies Coerced* list parameter aliases to several LLM-facing list args (tags/status lists/order id lists). |
statuspro_mcp_server/src/statuspro_mcp/tools/list_coercion.py |
Introduces the shared BeforeValidator + Coerced* aliases for recovering mistyped list inputs. |
scripts/pre-push-guard.sh |
Adds a pre-push hook script that blocks non-main refs from being pushed to remote main. |
CLAUDE.md |
Documents worktree hook behavior and the safe push refspec requirement / pitfall. |
.pre-commit-config.yaml |
Installs pre-push hooks by default and registers the new block-push-to-main hook. |
.claude/skills/open-pr/SKILL.md |
Updates the /open-pr workflow to require the explicit-destination push refspec. |
LLMs occasionally serialize list-typed tool arguments as a single string instead of a JSON array. Two shapes are observed in the wild: 1. CSV: ``order_ids='20486,20487,20488'`` 2. JSON-stringified: ``order_ids='[20486, 20487, 20488]'`` Both make pydantic raise ``Input should be a valid list [type=list_type, input_type=str]`` and the tool call aborts. The recovery is mechanical and lossless — split on commas (or parse as JSON), strip whitespace, hand pydantic a real list. So we do it. Adds: - ``statuspro_mcp/tools/list_coercion.py`` — a single ``coerce_str_list_input`` BeforeValidator that handles both shapes. Lists pass through unchanged; non-string non-list inputs fall through to pydantic's normal type error so genuinely malformed input still surfaces loudly. Six type aliases (``CoercedStrList`` / ``CoercedIntList`` / ``CoercedStrIntList`` and their ``Opt`` variants) collapse the per-field ``Annotated[list[X] | None, BeforeValidator(coerce_str_list_input)]`` boilerplate at call sites. - 19 unit tests covering passthrough, CSV, JSON-array string, whitespace, empty inputs, non-string inputs, mixed-type lists, and the ``min_length`` interaction. Applied to seven LLM-facing list parameters in ``orders.py``: - ``list_orders``: tags, tags_any, financial_status, fulfillment_status - ``get_orders_batch``: order_ids - ``lookup_orders_batch``: order_numbers - ``bulk_update_order_status``: order_ids Internal/response-side list fields don't need it — pydantic-on-pydantic round-trips already use real lists. Ported from katana-openapi-client #428 (mechanical port of the alias-based collapse already merged there). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two micro-simplifications surfaced by a /simplify review of the freshly-
landed list_coercion module:
- ``json.JSONDecodeError`` is a subclass of ``ValueError`` (verified at
runtime). Drop the redundant tuple in the except clause; catch
``ValueError`` alone.
- The CSV fallback comprehension called ``item.strip()`` twice per item
(once in the filter, once in the projection). Use a walrus assignment
so the strip happens once: ``[stripped for item in s.split(",") if
(stripped := item.strip())]``.
Behavior identical; all 19 unit tests still pass. Diverges from the
katana-openapi-client upstream by these two lines — re-sync if/when
katana picks up the same cleanup.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
15f4a7c to
3272e60
Compare
This was referenced Apr 30, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Catches statuspro-openapi-client up to recent developments in sister repo katana-openapi-client. Six commits, organized by theme:
Safety / process
dc3285e— pre-push guard hook blocking pushes that would land non-maincommits on themainref. Ports katana #434 (which followed a real release-pipeline incident:git push -u origin <branch>resolved through the local branch'sorigin/mainupstream and shipped straight to main).481354b—/open-prskill mandates the safe refspec form (git push -u origin HEAD:refs/heads/<branch>); CLAUDE.md gains a worktree note + push-refspec entry in Known Pitfalls. Ports katana #441.Test reliability
5a6eb4f— observability timing tests now mocktime.perf_counterfor deterministic duration assertions (==exact). Eliminates the< 200msupper-bound flake pattern that bit katana CI on Python 3.14 (235.7ms < 100ms). Test runtime drops from ~150ms to ~1ms. Ports katana #450 / #446 fix-up.MCP UX
a7fe3cc— four Cancel buttons in Prefab preview UIs swapSendMessage("Cancel the X")→ShowToast(...). No more synthetic user messages and LLM round-trips for a do-nothing action. Ports katana #440.3418251— newlist_coercionmodule recovers LLM-mistyped list inputs (CSV strings, JSON-stringified arrays) before pydantic. 19 unit tests; applied to seven LLM-facing list params (tags,tags_any,financial_status,fulfillment_status,order_ids×2,order_numbers). Ports katana #428 + the underlying #30f3fd86 fix.15f4a7c—/simplifypass on the freshly-landed coercion module: drop redundant(JSONDecodeError, ValueError)tuple (subclass relationship), single-strip via walrus.Skipped (with rationale)
Test plan
uv run poe check— full validation (already green locally: 295 passed, 1 skipped)refs/heads/<branch> → refs/heads/main:HEAD:refs/heads/testformrefs/heads/main → refs/heads/mainandrefs/heads/feat → refs/heads/feat: exit 0uv run pytest statuspro_mcp_server/tests/tools/test_list_coercion.py -v— 19 passeduv run pytest statuspro_mcp_server/tests/test_observability_decorators.py::test_observe_tool_timing_accuracy statuspro_mcp_server/tests/test_observability_decorators.py::test_observe_service_timing_accuracy -v— runtime <100ms total🤖 Generated with Claude Code