Skip to content

chore: align with katana-openapi-client safety + MCP fixes#60

Merged
dougborg merged 7 commits intomainfrom
chore/align-katana-safety-fixes
May 4, 2026
Merged

chore: align with katana-openapi-client safety + MCP fixes#60
dougborg merged 7 commits intomainfrom
chore/align-katana-safety-fixes

Conversation

@dougborg
Copy link
Copy Markdown
Owner

Summary

Catches statuspro-openapi-client up to recent developments in sister repo katana-openapi-client. Six commits, organized by theme:

Safety / process

  • dc3285e — pre-push guard hook blocking pushes that would land non-main commits on the main ref. Ports katana #434 (which followed a real release-pipeline incident: git push -u origin <branch> resolved through the local branch's origin/main upstream and shipped straight to main).
  • 481354b/open-pr skill mandates the safe refspec form (git push -u origin HEAD:refs/heads/<branch>); CLAUDE.md gains a worktree note + push-refspec entry in Known Pitfalls. Ports katana #441.

Test reliability

  • 5a6eb4f — observability timing tests now mock time.perf_counter for deterministic duration assertions (== exact). Eliminates the < 200ms upper-bound flake pattern that bit katana CI on Python 3.14 (235.7ms < 100ms). Test runtime drops from ~150ms to ~1ms. Ports katana #450 / #446 fix-up.

MCP UX

  • a7fe3cc — four Cancel buttons in Prefab preview UIs swap SendMessage("Cancel the X")ShowToast(...). No more synthetic user messages and LLM round-trips for a do-nothing action. Ports katana #440.
  • 3418251 — new list_coercion module recovers LLM-mistyped list inputs (CSV strings, JSON-stringified arrays) before pydantic. 19 unit tests; applied to seven LLM-facing list params (tags, tags_any, financial_status, fulfillment_status, order_ids×2, order_numbers). Ports katana #428 + the underlying #30f3fd86 fix.
  • 15f4a7c/simplify pass on the freshly-landed coercion module: drop redundant (JSONDecodeError, ValueError) tuple (subclass relationship), single-strip via walrus.

Skipped (with rationale)

  • katana #448 (deep-link embedding): needs StatusPro web-app URL scheme; not safely guessable from the codebase.
  • katana #446 (preview-fetches-backing-data + BLOCK marker): statuspro's preview path uses bare dicts, not Prefab branches with empty-render risk.
  • katana #420/#421 (spec drift audit tooling, ~1500 LOC): ROI question for statuspro's 7-endpoint surface vs. katana's 110 paths. Worth a smaller adaptation as a separate decision.
  • katana #456, #365, #406, #449 (typed-cache, EntitySpec, MO triage): katana-specific scale features that don't apply.

Test plan

  • uv run poe check — full validation (already green locally: 295 passed, 1 skipped)
  • Verify pre-push guard rejects refs/heads/<branch> → refs/heads/main:
    echo "refs/heads/test abc refs/heads/main def" | scripts/pre-push-guard.sh origin <url>
    
    expected: exit 1, helpful error suggesting HEAD:refs/heads/test form
  • Verify guard allows refs/heads/main → refs/heads/main and refs/heads/feat → refs/heads/feat: exit 0
  • Run only the new list-coercion tests: uv run pytest statuspro_mcp_server/tests/tools/test_list_coercion.py -v — 19 passed
  • Confirm timing tests are deterministic: uv run pytest statuspro_mcp_server/tests/test_observability_decorators.py::test_observe_tool_timing_accuracy statuspro_mcp_server/tests/test_observability_decorators.py::test_observe_service_timing_accuracy -v — runtime <100ms total

🤖 Generated with Claude Code

dougborg and others added 4 commits April 29, 2026 21:13
… branches

When a feature branch is created via ``git checkout -b <name> origin/main``,
git sets the new local branch's upstream to ``origin/main``. A subsequent
``git push -u origin <name>`` then resolves to the tracked upstream and pushes
straight to main — bypassing PR review entirely. Our sister repo
(katana-openapi-client commit 30f3fd86) hit exactly this in production: a
non-PR push to main triggered semantic-release and published an unintended
PyPI build before the pipeline could be cancelled.

Statuspro has no equivalent guard. Adds the same mechanical line of defense
katana ported in #434:

- ``scripts/pre-push-guard.sh`` — pre-commit-compatible pre-push hook that
  refuses pushes where ``remote_ref == refs/heads/main`` and ``local_ref !=
  refs/heads/main``. Suggests the safe form (``git push -u origin
  HEAD:refs/heads/<branch>``) in the rejection message.
- ``.pre-commit-config.yaml`` — adds ``default_install_hook_types:
  [pre-commit, pre-push]`` and registers ``block-push-to-main`` in the
  ``pre-push`` stage. Existing ``pre-commit install`` invocations now
  install both hook types.

Existing collaborators must re-run ``uv run pre-commit install`` once for
the pre-push hook to take effect; the README quick-start handles fresh
clones automatically.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Pairs with the pre-push guard added in the previous commit. Documents the
push-refspec trap in two places that don't read CLAUDE.md mid-flow:

- ``.claude/skills/open-pr/SKILL.md`` CRITICAL block + Phase 5 — mandates
  ``git push -u origin HEAD:refs/heads/<branch>`` over the bare-branch form.
  The bare form resolves to the local branch's tracked upstream; if a
  branch was created via ``git checkout -b <name> origin/main`` the
  upstream is ``origin/main`` and the bare push targets main.
- ``CLAUDE.md`` Known Pitfalls — same explanation, plus a worktree note
  pointing out that pre-commit hooks (including the new pre-push guard)
  aren't shared across worktrees and require ``pre-commit install`` per
  worktree.

Lifted from katana-openapi-client's harness retro (PR #441) where the same
trap bit twice during PR development.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Both ``test_observe_service_timing_accuracy`` and
``test_observe_tool_timing_accuracy`` were measuring real time after
``await asyncio.sleep(...)`` and asserting the duration fell within
loose-lower-bound / tight-upper-bound brackets:

- tool variant: ``50 <= duration_ms < 200`` after a 100ms sleep
- service variant: ``50 <= duration_ms < 500`` after a 50ms sleep

The tool variant's ``< 200`` upper bound is the same flake pattern that
broke katana-openapi-client CI on Python 3.14 (235.7ms < 100ms in their
analogous service test, fixed in their #450 sweep). asyncio scheduler
+ GIL contention under CI runner load routinely pushes the measured
value well above tight upper bounds.

Both tests now mock ``time.perf_counter`` (the module-level binding the
decorator imports as ``time.perf_counter`` in ``statuspro_mcp.logging``)
with ``side_effect=[start, end]``. The computed duration is exact and
deterministic — no real sleep, no scheduler variance, exact-equality
assertion. The tests' name claims "timing accuracy" and now actually
tests that: the decorator computes ``(end - start) * 1000`` correctly.
Real asyncio scheduling latency was never the right thing to assert.

Test runtime drops from ~150ms (sleep-bound) to ~1ms (deterministic).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…trip)

Four Cancel buttons in Prefab preview UIs were sending fake user messages
back through the LLM ("Cancel the status change" / "Cancel the comment" /
"Cancel the due date change" / "Cancel the bulk update"). The model would
acknowledge in chat for what is functionally a "do nothing" action,
cluttering the conversation with a synthetic user message + LLM response.

Replaced with ``ShowToast`` (client-side only, no server trip per the
prefab_ui actions docstring): the user gets visible "Cancelled" feedback
in the iframe overlay, no chat message is appended, no LLM round-trip
happens, no tool is invoked.

Sites:
- ``build_status_change_preview_ui`` → "Status change cancelled"
- ``build_comment_preview_ui`` → "Comment cancelled"
- ``build_due_date_change_preview_ui`` → "Due date change cancelled"
- ``build_bulk_status_change_preview_ui`` → "Bulk update cancelled"

The two remaining ``SendMessage`` sites in ``prefab_ui.py`` (lines 232,
266) are intentional action triggers ("Add a comment to order...", etc.)
that prompt the LLM to follow up — those stay as is.

Ported from katana-openapi-client #440.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings April 30, 2026 03:27
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR ports several safety/process guardrails, test reliability fixes, and MCP UX improvements from katana-openapi-client into statuspro-openapi-client, primarily focused on preventing accidental pushes to main and making MCP tool interactions more robust and user-friendly.

Changes:

  • Add a pre-push guard (and hook installation defaults) to block accidental non-mainmain pushes; update contributor/docs skills to mandate safe push refspecs.
  • Make observability timing tests deterministic by mocking time.perf_counter (removing sleep/jitter flakes).
  • Improve MCP UX and input robustness: replace “Cancel” buttons’ synthetic messages with ShowToast, and add BeforeValidator-based coercion for LLM-mistyped list parameters (with unit tests) and apply it to Orders tools.

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
statuspro_mcp_server/tests/tools/test_list_coercion.py Adds unit coverage for list-input coercion behavior across CSV/JSON-stringified inputs and error cases.
statuspro_mcp_server/tests/test_observability_decorators.py Makes timing assertions deterministic by patching perf_counter instead of using real sleeps.
statuspro_mcp_server/src/statuspro_mcp/tools/prefab_ui.py Swaps “Cancel” actions from SendMessage(...) to local ShowToast(...) to avoid unnecessary LLM round-trips.
statuspro_mcp_server/src/statuspro_mcp/tools/orders.py Applies Coerced* list parameter aliases to several LLM-facing list args (tags/status lists/order id lists).
statuspro_mcp_server/src/statuspro_mcp/tools/list_coercion.py Introduces the shared BeforeValidator + Coerced* aliases for recovering mistyped list inputs.
scripts/pre-push-guard.sh Adds a pre-push hook script that blocks non-main refs from being pushed to remote main.
CLAUDE.md Documents worktree hook behavior and the safe push refspec requirement / pitfall.
.pre-commit-config.yaml Installs pre-push hooks by default and registers the new block-push-to-main hook.
.claude/skills/open-pr/SKILL.md Updates the /open-pr workflow to require the explicit-destination push refspec.

Comment thread statuspro_mcp_server/src/statuspro_mcp/tools/orders.py Outdated
Comment thread statuspro_mcp_server/src/statuspro_mcp/tools/orders.py Outdated
dougborg and others added 2 commits April 29, 2026 21:34
LLMs occasionally serialize list-typed tool arguments as a single string
instead of a JSON array. Two shapes are observed in the wild:

1. CSV: ``order_ids='20486,20487,20488'``
2. JSON-stringified: ``order_ids='[20486, 20487, 20488]'``

Both make pydantic raise ``Input should be a valid list [type=list_type,
input_type=str]`` and the tool call aborts. The recovery is mechanical
and lossless — split on commas (or parse as JSON), strip whitespace,
hand pydantic a real list. So we do it.

Adds:

- ``statuspro_mcp/tools/list_coercion.py`` — a single ``coerce_str_list_input``
  BeforeValidator that handles both shapes. Lists pass through unchanged;
  non-string non-list inputs fall through to pydantic's normal type error
  so genuinely malformed input still surfaces loudly. Six type aliases
  (``CoercedStrList`` / ``CoercedIntList`` / ``CoercedStrIntList`` and
  their ``Opt`` variants) collapse the per-field
  ``Annotated[list[X] | None, BeforeValidator(coerce_str_list_input)]``
  boilerplate at call sites.
- 19 unit tests covering passthrough, CSV, JSON-array string, whitespace,
  empty inputs, non-string inputs, mixed-type lists, and the
  ``min_length`` interaction.

Applied to seven LLM-facing list parameters in ``orders.py``:

- ``list_orders``: tags, tags_any, financial_status, fulfillment_status
- ``get_orders_batch``: order_ids
- ``lookup_orders_batch``: order_numbers
- ``bulk_update_order_status``: order_ids

Internal/response-side list fields don't need it — pydantic-on-pydantic
round-trips already use real lists.

Ported from katana-openapi-client #428 (mechanical port of the alias-based
collapse already merged there).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two micro-simplifications surfaced by a /simplify review of the freshly-
landed list_coercion module:

- ``json.JSONDecodeError`` is a subclass of ``ValueError`` (verified at
  runtime). Drop the redundant tuple in the except clause; catch
  ``ValueError`` alone.
- The CSV fallback comprehension called ``item.strip()`` twice per item
  (once in the filter, once in the projection). Use a walrus assignment
  so the strip happens once: ``[stripped for item in s.split(",") if
  (stripped := item.strip())]``.

Behavior identical; all 19 unit tests still pass. Diverges from the
katana-openapi-client upstream by these two lines — re-sync if/when
katana picks up the same cleanup.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@dougborg dougborg merged commit a0a5a63 into main May 4, 2026
9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants