[feature not live] docs(merge-queue): add Testing Duration chart and drill-down to metrics page by samgutentag · Pull Request #662 · trunk-io/docs

samgutentag · 2026-05-14T18:11:31Z

Adds documentation for two related Merge Queue metrics features shipped in v175.

Changes

Adds a Testing Duration section to the metrics page covering the chart, filters (Outcome, Cycle ended in), and statistical measures
Documents drill-down into individual test runs from the Testing Duration chart (trunk2 PR #3936)
Updates the "Drill down into metrics" section to include Testing Duration alongside Conclusion count and Time in queue
Removes the hint that drill-down was only available on Conclusion count and Time in queue

Context

Testing Duration chart: trunk2 PR #3919
Testing Duration drill-down: trunk2 PR #3936
Supersedes the stale sam-gutentag/mq-testing-duration-chart branch (PR #647, "[feature not live] docs(merge-queue): add Testing Duration chart to metrics page"), which conflicted with multiple changes on main

…to metrics page

Documents two related features that ship in v175:

- Testing Duration chart (trunk2 PR #3919 + context): shows per-testing-cycle durations with Outcome and Cycle ended in filters and statistical measures
- Testing metrics drill-down (trunk2 PR #3936): clicking into a range on the Testing Duration chart shows individual PR test runs with duration, outcome, and cycle conclusion columns

Also updates the "Drill down into metrics" intro to include Testing Duration alongside Conclusion count and Time in queue, and removes the hint that said drill-down was not yet available on additional charts.

https://github.com/trunk-io/trunk2/pull/3936

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

trunk-staging-io · 2026-05-14T18:11:35Z

Merging to main in this repository is managed by Trunk.

To merge this pull request, check the box to the left or comment /trunk merge below.

After your PR is submitted to the merge queue, this comment will be automatically updated with its status. If the PR fails, failure details will also be posted here.

claude · 2026-05-14T18:11:58Z

Claude finished @samgutentag's task in 1m 45s

PR review

Read the changed file
Analyze the diff for typos, grammar, and formatting
Check documentation best practices
Post review feedback

Summary

Solid additions overall — the section is well-structured, clearly distinguishes Testing duration from Time in queue, and properly updates the Drill-down section and removes the now-stale hint. A few prose-level issues are worth tightening before merge; none are blocking.

Inline comments posted

Line 110: "uses separate data bucketing" is awkward, and the trailing hover sentence reads as a non-sequitur. Suggested a single combined sentence.
Line 115: "if its testing cycle restarted" mixes present-tense conditional with past-tense verb; suggest "was restarted".
Line 129: "preempt" is being used as a noun in a list of nouns; should be "preemption".
Line 130 (spelling consistency): this PR adds ~6 instances of "Cancelled" (British), but the body of the doc primarily uses "canceled" (US). Pre-existing inconsistency, but worth standardizing.
Line 162 (parallelism): "tests passed, failed, interrupted, or were cancelled" mixes voice. Suggested "passed, failed, were interrupted, or were cancelled".
Line 163 (terminology): the filter is "Cycle ended in" but the drill-down column is "Cycle conclusion". Verify against the UI; align if they're the same.

Not flagged inline (FYI)

Heading case: existing "#### Why Drill Down?" and "#### Select Data Points" use Title Case, while the new "#### Filters", "#### Statistical measures", and "#### Drill down into individual test runs" use sentence case. Pre-existing inconsistency, not introduced by this PR.
Table column-width padding on the new tables doesn't match the padded style used for the Time in queue table above. Renders identically; purely cosmetic.

claude · 2026-05-14T18:13:27Z

+| All Outcomes | Include all testing cycles (default) |
+| Passed | Cycles where tests passed |
+| Failed | Cycles where tests failed |
+| Interrupted | Test runs cut short by a restart, preempt, or base-branch change |


Grammar: "preempt" is being used as a noun here, but it's a verb. Use "preemption" (or "preemption event") for parallelism with the other nouns in the list.

Suggested change

-| Interrupted | Test runs cut short by a restart, preempt, or base-branch change |
+| Interrupted | Test runs cut short by a restart, preemption, or base-branch change |

claude · 2026-05-14T18:13:29Z

+
+* **PR number** — links directly to the pull request on GitHub.
+* **Testing duration** — how long that PR's testing cycle took.
+* **Outcome** — whether tests passed, failed, interrupted, or were cancelled.


Parallelism: the list mixes active and passive voice: "tests passed, failed, interrupted, were cancelled". "interrupted" reads awkwardly as a passive without "were". Restructure for consistent voice:

Suggested change

-* **Outcome** — whether tests passed, failed, interrupted, or were cancelled.
+* **Outcome** — whether tests passed, failed, were interrupted, or were cancelled.

claude · 2026-05-14T18:13:32Z


+### Testing duration
+
+Testing duration shows how long each PR spends in the Testing state within the Merge Queue — measured from when testing begins to when the testing cycle reaches its final outcome. Unlike the Conclusion count and Time in queue charts, testing duration uses separate data bucketing. Hovering over a data point does not highlight corresponding points on the other charts.


Clarity: "uses separate data bucketing" is awkward and a bit jargony, and the closing sentence about hover behavior is a non-sequitur after it. Consider rephrasing so the reader knows why this matters: the time buckets don't sync up with the other charts.

Suggested change

-Testing duration shows how long each PR spends in the Testing state within the Merge Queue — measured from when testing begins to when the testing cycle reaches its final outcome. Unlike the Conclusion count and Time in queue charts, testing duration uses separate data bucketing. Hovering over a data point does not highlight corresponding points on the other charts.
+Testing duration shows how long each PR spends in the Testing state within the Merge Queue — measured from when testing begins to when the testing cycle reaches its final outcome. Unlike the Conclusion count and Time in queue charts, the Testing duration chart uses its own time buckets, so hovering over a data point does not highlight corresponding points on the other charts.

claude · 2026-05-14T18:13:35Z

+This is distinct from [Time in queue](#time-in-queue), which measures total time from queue entry to exit. A PR that waits before testing starts will have a longer time in queue but the same testing duration. Use this chart to understand CI performance specifically, separate from queue wait time.
+
+{% hint style="info" %}
+Each data point represents one testing-to-final-state transition. A single PR can contribute multiple data points if its testing cycle restarted.


Tense: "if its testing cycle restarted" mixes a present-tense conditional with a past-tense verb. Either present ("restarts") or passive ("was restarted") reads more cleanly:

Suggested change

-Each data point represents one testing-to-final-state transition. A single PR can contribute multiple data points if its testing cycle restarted.
+Each data point represents one testing-to-final-state transition. A single PR can contribute multiple data points if its testing cycle was restarted.

claude · 2026-05-14T18:13:37Z

+* **PR number** — links directly to the pull request on GitHub.
+* **Testing duration** — how long that PR's testing cycle took.
+* **Outcome** — whether tests passed, failed, interrupted, or were cancelled.
+* **Cycle conclusion** — the PR's overall outcome (Merged, Failed, Cancelled, or In Flight).


Terminology: the filter is documented as "Cycle ended in" (line 132), but the drill-down column here is "Cycle conclusion". If those really are two different labels in the UI, this is fine, but it is worth double-checking, since readers will assume the column shows the value of the filter they just used. If the UI label is the same, align the docs.

claude · 2026-05-14T18:13:40Z

+| Passed | Cycles where tests passed |
+| Failed | Cycles where tests failed |
+| Interrupted | Test runs cut short by a restart, preempt, or base-branch change |
+| Cancelled | Cycles cancelled mid-test |


Spelling consistency: the new content uses "Cancelled" / "cancelled" (British) throughout, but the body of this doc primarily uses "canceled" / "Canceled" (US); see lines 82, 88, 92, 177. The pre-existing inconsistency means this isn't a regression introduced by the PR, but since you're adding ~6 new instances, it might be worth standardizing in this PR (or a follow-up). "canceled" matches the bulk of the existing prose; "cancelled" matches the Prometheus metric label.
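
If it helps when standardizing, a quick audit of the two spellings could look like the sketch below. It is a generic helper, not part of the docs tooling; the function name `count_spellings` and the sample text are made up for illustration, and in practice the text would come from the metrics page source.

```python
import re
from collections import Counter

# Matches "canceled" and "cancelled" in any letter case; group 1 captures the optional second "l".
_PATTERN = re.compile(r"\bcancel(l?)ed\b", re.IGNORECASE)


def count_spellings(text: str) -> Counter:
    """Count US ("canceled") versus British ("cancelled") spellings in text."""
    counts: Counter = Counter()
    for match in _PATTERN.finditer(text):
        key = "cancelled" if match.group(1) else "canceled"
        counts[key] += 1
    return counts


# Made-up sample; replace with the contents of the page being reviewed.
sample = "| Cancelled | Cycles cancelled mid-test |\nThe PR was canceled."
counts = count_spellings(sample)
```

Running the counter over the whole page before and after the change would show whether the PR moves the doc toward one spelling or further splits it.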

github-actions

Auto-approved: Claude code review passed.

claude Bot reviewed May 14, 2026


github-actions Bot approved these changes May 14, 2026


samgutentag wants to merge 1 commit into main from sam-gutentag/mq-testing-duration-metrics-v175

2 participants
