[feature not live] docs(merge-queue): add Testing Duration chart and drill-down to metrics page#662
Conversation
Documents two related features that ship in v175:

- Testing Duration chart (trunk2 PR #3919 + context): shows per-testing-cycle durations with Outcome and Cycle ended in filters and statistical measures
- Testing metrics drill-down (trunk2 PR #3936): clicking into a range on the Testing Duration chart shows individual PR test runs with duration, outcome, and cycle conclusion columns

Also updates the "Drill down into metrics" intro to include Testing Duration alongside Conclusion count and Time in queue, and removes the hint that said drill-down was not yet available on additional charts.

https://github.com/trunk-io/trunk2/pull/3936

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Claude finished @samgutentag's task in 1m 45s. PR review:

Summary: Solid additions overall. The section is well-structured, clearly distinguishes Testing duration from Time in queue, and properly updates the Drill-down section and removes the now-stale hint. A few prose-level issues are worth tightening before merge; none are blocking. Inline comments posted.
Not flagged inline (FYI)
| All Outcomes | Include all testing cycles (default) |
| Passed | Cycles where tests passed |
| Failed | Cycles where tests failed |
| Interrupted | Test runs cut short by a restart, preempt, or base-branch change |
Grammar: preempt is being used as a noun here, but it's a verb. Use preemption (or preemption event) for parallelism with the other nouns in the list.
Suggested change:
Before: `| Interrupted | Test runs cut short by a restart, preempt, or base-branch change |`
After: `| Interrupted | Test runs cut short by a restart, preemption, or base-branch change |`
* **PR number** — links directly to the pull request on GitHub.
* **Testing duration** — how long that PR's testing cycle took.
* **Outcome** — whether tests passed, failed, interrupted, or were cancelled.
Parallelism: the list mixes active and passive voice — tests passed, failed, interrupted, were cancelled. interrupted reads awkwardly as a passive without were. Restructure for consistent voice:
Suggested change:
Before: `* **Outcome** — whether tests passed, failed, interrupted, or were cancelled.`
After: `* **Outcome** — whether tests passed, failed, were interrupted, or were cancelled.`
### Testing duration
Testing duration shows how long each PR spends in the Testing state within the Merge Queue — measured from when testing begins to when the testing cycle reaches its final outcome. Unlike the Conclusion count and Time in queue charts, testing duration uses separate data bucketing. Hovering over a data point does not highlight corresponding points on the other charts.
Clarity: uses separate data bucketing is awkward and a bit jargony, and the closing sentence about hover behavior is a non-sequitur after it. Consider rephrasing so the reader knows why this matters — that the time buckets don't sync up with the other charts:
Suggested change:
Before: `Testing duration shows how long each PR spends in the Testing state within the Merge Queue — measured from when testing begins to when the testing cycle reaches its final outcome. Unlike the Conclusion count and Time in queue charts, testing duration uses separate data bucketing. Hovering over a data point does not highlight corresponding points on the other charts.`
After: `Testing duration shows how long each PR spends in the Testing state within the Merge Queue — measured from when testing begins to when the testing cycle reaches its final outcome. Unlike the Conclusion count and Time in queue charts, the Testing duration chart uses its own time buckets, so hovering over a data point does not highlight corresponding points on the other charts.`
This is distinct from [Time in queue](#time-in-queue), which measures total time from queue entry to exit. A PR that waits before testing starts will have a longer time in queue but the same testing duration. Use this chart to understand CI performance specifically, separate from queue wait time.
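The distinction the quoted paragraph draws can be sketched with event timestamps. This is a minimal illustration, not the product's implementation; the field names (`enqueued_at`, `testing_started_at`, `cycle_ended_at`) are assumptions chosen for the example:

```python
from datetime import datetime, timedelta

def queue_metrics(enqueued_at, testing_started_at, cycle_ended_at):
    """Compute the two durations the docs distinguish.

    time_in_queue: total time from queue entry to exit.
    testing_duration: time from when testing begins to when the
    testing cycle reaches its final outcome.
    """
    return {
        "time_in_queue": cycle_ended_at - enqueued_at,
        "testing_duration": cycle_ended_at - testing_started_at,
    }

# A PR that waits 10 minutes in the queue before testing starts:
t0 = datetime(2025, 1, 1, 12, 0)
m = queue_metrics(
    enqueued_at=t0,
    testing_started_at=t0 + timedelta(minutes=10),
    cycle_ended_at=t0 + timedelta(minutes=40),
)
print(m["time_in_queue"])     # 0:40:00
print(m["testing_duration"])  # 0:30:00
```

The wait before testing inflates time in queue (40 minutes) but leaves testing duration (30 minutes) untouched, which is exactly why the docs position the chart as a CI-performance view.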
{% hint style="info" %}
Each data point represents one testing-to-final-state transition. A single PR can contribute multiple data points if its testing cycle restarted.
Tense: if its testing cycle restarted mixes a present-tense conditional with a past-tense verb. Either present (restarts) or passive (was restarted) reads more cleanly:
Suggested change:
Before: `Each data point represents one testing-to-final-state transition. A single PR can contribute multiple data points if its testing cycle restarted.`
After: `Each data point represents one testing-to-final-state transition. A single PR can contribute multiple data points if its testing cycle was restarted.`
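The hint's counting rule (one data point per testing-to-final-state transition, so a restarted PR appears more than once) can be illustrated with a toy event log. The tuple layout and PR numbers here are hypothetical, invented purely for the sketch:

```python
from collections import Counter

# Hypothetical testing-cycle events: (pr_number, outcome).
# Each tuple is one testing-to-final-state transition.
events = [
    (101, "interrupted"),  # first cycle cut short by a restart
    (101, "passed"),       # second cycle after the restart
    (102, "passed"),
]

# Chart data points contributed per PR.
points_per_pr = Counter(pr for pr, _ in events)
print(points_per_pr[101])  # 2 (restarted PR contributes twice)
print(points_per_pr[102])  # 1
```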
* **PR number** — links directly to the pull request on GitHub.
* **Testing duration** — how long that PR's testing cycle took.
* **Outcome** — whether tests passed, failed, interrupted, or were cancelled.
* **Cycle conclusion** — the PR's overall outcome (Merged, Failed, Cancelled, or In Flight).
Terminology: the filter is documented as Cycle ended in (line 132), but the drill-down column here is Cycle conclusion. If those really are two different labels in the UI, this is fine — but worth double-checking, since readers will assume the column shows the value of the filter they just used. If the UI label is the same, align the docs.
| Passed | Cycles where tests passed |
| Failed | Cycles where tests failed |
| Interrupted | Test runs cut short by a restart, preempt, or base-branch change |
| Cancelled | Cycles cancelled mid-test |
Spelling consistency: the new content uses Cancelled / cancelled (British) throughout, but the body of this doc primarily uses canceled / Canceled (US) — see lines 82, 88, 92, 177. Pre-existing inconsistency means this isn't a regression introduced by the PR, but since you're adding ~6 new instances, it might be worth standardizing in this PR (or a follow-up). canceled matches the bulk of the existing prose; cancelled matches the Prometheus metric label.
Adds documentation for two related Merge Queue metrics features shipped in v175.
Changes
Context
sam-gutentag/mq-testing-duration-chart branch (PR #647: [feature not live] docs(merge-queue): add Testing Duration chart to metrics page), which conflicted with multiple main changes.

Generated by Claude Code