docs(flaky-tests): document monitor action type selection for failure rate and failure count monitors by samgutentag · Pull Request #655 · trunk-io/docs

samgutentag · 2026-05-13T18:08:26Z

Summary

Adds an Action Type section to both the Failure Rate Monitor and Failure Count Monitor pages explaining the two options: Classify test status vs. Apply labels
Moves Detection Type (flaky/broken) into context as a setting that applies only to classify-action monitors
Updates the configuration tables and subsections in both pages to reflect the new creation form flow
Links to the Test Labels page for the apply-labels path

Shipped in v174 via trunk-io/trunk2#3945. Pairs with #651 which documents automatic labeling from the test-labels side.

Test plan

Verify Action Type section appears in correct location on both pages
Verify Detection Type section is clearly scoped to classify-action monitors
Check links to test-labels.md#automatic-labeling-from-monitors resolve
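The link check in the test plan could be scripted along these lines. This is only a sketch: the slug rule (lowercase, punctuation dropped, spaces turned into hyphens) is an assumed GitHub-style convention, not something confirmed for this docs build, and `sample_page` is a made-up stand-in for the contents of test-labels.md.

```python
import re


def heading_slug(heading: str) -> str:
    """Approximate a GitHub-style anchor (assumed rule, not the docs build's own)."""
    text = heading.strip().lower()
    text = re.sub(r"[^\w\- ]", "", text)  # drop punctuation, keep word chars, hyphens, spaces
    return text.replace(" ", "-")


def heading_anchors(markdown: str) -> set[str]:
    """Collect the anchors generated by ATX headings (lines starting with 1-6 '#')."""
    anchors = set()
    for line in markdown.splitlines():
        match = re.match(r"^#{1,6}\s+(.*?)\s*#*\s*$", line)
        if match:
            anchors.add(heading_slug(match.group(1)))
    return anchors


def anchor_resolves(markdown: str, anchor: str) -> bool:
    """True if '#anchor' points at a heading that exists in the page."""
    return anchor.lstrip("#") in heading_anchors(markdown)


# Hypothetical page content standing in for test-labels.md.
sample_page = "\n".join([
    "# Test Labels",
    "",
    "## Automatic labeling from monitors",
    "",
    "Some body text.",
])

print(anchor_resolves(sample_page, "#automatic-labeling-from-monitors"))
```

The sketch does not skip fenced code blocks, so a `#` comment inside a code sample would be counted as a heading; a real check would need to handle that.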

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

When creating a failure rate or failure count monitor, users now choose an action type first -- either classify test status (flaky/broken) or apply labels. Add an Action Type section to both monitor pages explaining the two options and moving Detection Type into context as a classify-only setting.

Shipped in v174 via trunk2#3945.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

trunk-staging-io · 2026-05-13T18:08:30Z

Merging to main in this repository is managed by Trunk.

To merge this pull request, check the box to the left or comment /trunk merge below.

After your PR is submitted to the merge queue, this comment will be automatically updated with its status. If the PR fails, failure details will also be posted here.

claude · 2026-05-13T18:09:01Z

Claude finished @samgutentag's task in 1m 43s

Review

Read changed files
Check for typos, grammar, and formatting issues
Verify links resolve
Check structure and consistency between failure-rate and failure-count pages
Post review feedback

The new Action Type / Detection Type structure is solid and mirrors cleanly between the two pages. The link to test-labels.md#automatic-labeling-from-monitors resolves to an existing section. A few small things to address (posted as inline comments):

Frontmatter description inconsistency between the two pages. Only failure-count-monitor.md's description was updated to mention labeling; failure-rate-monitor.md's description still says "Detect flaky or broken tests…". Pick one direction and apply it to both. The new failure-count phrasing ("Detect and classify or label tests") also parses awkwardly — a small reword would help.
"classify-action" is a coined adjective that is never defined (appears once on each page right after the section's own scoping sentence). Since the section header already scopes the discussion, the qualifier is redundant — suggest simplifying.
Wording mismatch in the Action Type intros. The intro says "when a test is flagged" but the Apply labels bullet says "when the monitor activates." The labels path doesn't flag, so a more neutral verb ("matches a test") avoids implying that labeling produces a flag.

Nothing blocking — the substantive content is right and the changes are consistent with what's in trunk2#3945. Note that the companion update in test-labels.md (the "Coming soon" hint at lines 35-37) presumably gets removed in #651.

Out of scope but worth flagging separately: failure-count-monitor.md has two ## Preview Panel sections (lines 117 and 129). That predates this PR.
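A duplicate-heading check like the one behind the Preview Panel observation can be sketched as follows. The function name, the sample page, and the decision to look only at one heading level are illustrative assumptions; nothing here reflects tooling actually used in this repository.

```python
import re
from collections import defaultdict

FENCE = "`" * 3  # code-fence marker, built this way to keep the example self-contained


def duplicate_headings(markdown: str, level: int = 2) -> dict[str, list[int]]:
    """Map each heading title that occurs more than once to its 1-based line numbers."""
    pattern = re.compile(rf"^#{{{level}}}\s+(.+?)\s*$")
    seen = defaultdict(list)
    in_fence = False
    for number, line in enumerate(markdown.splitlines(), start=1):
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence  # ignore anything inside fenced code
            continue
        if in_fence:
            continue
        match = pattern.match(line)
        if match:
            seen[match.group(1)].append(number)
    return {title: lines for title, lines in seen.items() if len(lines) > 1}


# Hypothetical page with a repeated H2, loosely modeled on the situation described above.
sample = "\n".join([
    "# Failure Count Monitor",
    "## Preview Panel",
    "Text",
    "## Configuration",
    "## Preview Panel",
])

print(duplicate_headings(sample))
```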

claude · 2026-05-13T18:10:22Z

@@ -1,5 +1,5 @@
 ---
-description: Detect flaky or broken tests as soon as they accumulate a configured number of failures
+description: Detect and classify or label tests as soon as they accumulate a configured number of failures


Frontmatter inconsistency between the two pages.

This page's description was updated to mention labeling ("Detect and classify or label tests…") but failure-rate-monitor.md:2 was left as the original "Detect flaky or broken tests…" and never updated to reflect the new apply-labels action. Either both should mention the labels path or neither should — currently the two sibling pages disagree about whether labeling is part of the monitor's purpose.

Also, the new phrasing "Detect and classify or label tests" parses awkwardly. A cleaner version that mirrors the H1 ("Failure Count Monitor") without the parse ambiguity:

Suggested change

-description: Detect and classify or label tests as soon as they accumulate a configured number of failures
+description: Classify or label tests as soon as they accumulate a configured number of failures

claude · 2026-05-13T18:10:27Z

-Each failure count monitor has a **detection type** -- either **flaky** or **broken** -- which controls what status a test receives when the monitor flags it:
+Applies only to monitors with the **Classify test status** action type.
+
+Each classify-action failure count monitor has a **detection type** -- either **flaky** or **broken** -- which controls what status a test receives when the monitor flags it:


"classify-action" reads as a coined term that is never defined.

The previous line already scopes this paragraph to monitors with the Classify test status action type, so the adjective here is redundant and a bit jargony — readers may wonder if "classify-action monitor" is a formal name they should know. The matching line in failure-rate-monitor.md:24 has the same phrasing.

Suggest dropping the qualifier since the section intro already establishes the scope:

Suggested change

-Each classify-action failure count monitor has a **detection type** -- either **flaky** or **broken** -- which controls what status a test receives when the monitor flags it:
+Each such failure count monitor has a **detection type** -- either **flaky** or **broken** -- which controls what status a test receives when the monitor flags it:

(And similarly on the failure-rate page.)

claude · 2026-05-13T18:10:31Z

+When creating a failure count monitor, choose what action it takes when a test is flagged:
+
+- **Classify test status** — marks the test as flaky or broken. This is the default and integrates with quarantine workflows and status-based filtering.
+- **Apply labels** — tags matching tests with one or more labels when the monitor activates. Use this when you want to categorize tests automatically without changing their status. See [Automatic labeling from monitors](../management/test-labels.md#automatic-labeling-from-monitors) for details.


Minor wording mismatch: the intro says "when a test is flagged" but the Apply labels bullet then says "when the monitor activates." The label path doesn't flag tests — it labels them — so "flagged" is misleading for one of the two options. Consider:

Suggested change

-When creating a failure count monitor, choose what action it takes when a test is flagged:
+When creating a failure count monitor, choose what action it takes when the monitor matches a test:

(The **Classify test status** and **Apply labels** bullets are unchanged.)

Same wording exists at failure-rate-monitor.md:13 and could be updated in parallel.
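To make the scoping concrete, here is a minimal sketch of how the two action types and the classify-only detection type relate. All class, field, and message names are hypothetical; they model the rules as described in this PR, not Trunk's actual API or creation form.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class ActionType(Enum):
    CLASSIFY_TEST_STATUS = "classify_test_status"  # described as the default
    APPLY_LABELS = "apply_labels"


class DetectionType(Enum):
    FLAKY = "flaky"
    BROKEN = "broken"


@dataclass
class MonitorConfig:
    """Hypothetical monitor settings; field names are illustrative only."""
    name: str
    action_type: ActionType = ActionType.CLASSIFY_TEST_STATUS
    detection_type: Optional[DetectionType] = None
    labels: list[str] = field(default_factory=list)

    def validate(self) -> list[str]:
        """Return a list of problems; an empty list means the config is consistent."""
        problems = []
        if self.action_type is ActionType.CLASSIFY_TEST_STATUS:
            if self.detection_type is None:
                problems.append("Classify test status needs a detection type")
        else:
            if self.detection_type is not None:
                problems.append("detection type applies only to Classify test status")
            if not self.labels:
                problems.append("Apply labels needs one or more labels")
        return problems


classify = MonitorConfig("main-flaky", detection_type=DetectionType.FLAKY)
labeler = MonitorConfig("slow-tests", ActionType.APPLY_LABELS, labels=["needs-triage"])
print(classify.validate(), labeler.validate())
```

Modeling detection type as optional keeps it visibly tied to the classify path, which is the same scoping the Detection Type section states in prose.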

claude · 2026-05-13T18:10:37Z

 The failure rate monitor detects tests based on failure rate over a rolling time window. Unlike pass-on-retry, which looks for a specific pattern on a single commit, the failure rate monitor identifies tests that fail too often over a period of time, even if no individual failure looks like a retry.

-You can create multiple failure rate monitors with different configurations. This is how you tailor detection to different branches, test volumes, sensitivity levels, and detection types.
+You can create multiple failure rate monitors with different configurations. This is how you tailor detection to different branches, test volumes, sensitivity levels, and action types.


Related to the description-frontmatter comment on the failure-count page: this file's description (line 2, outside the diff) still reads "Detect flaky or broken tests based on failure rate over a configurable time window" and was not updated to mention the new apply-labels action. The sibling page's description was updated. For consistency, consider updating this page's description here too, or reverting the failure-count one — they should agree.
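For readers unfamiliar with the rolling-window idea quoted in the diff context above, a rough sketch follows. The `threshold` and `min_runs` parameters, the `RunResult` type, and the sample data are assumptions made for illustration; the actual monitor's settings and evaluation logic are not shown in this PR.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class RunResult:
    """One execution of a test (hypothetical shape)."""
    finished_at: datetime
    passed: bool


def failure_rate(runs: list[RunResult], now: datetime, window: timedelta) -> Optional[float]:
    """Fraction of failed runs inside the window ending at `now`, or None if there are none."""
    recent = [r for r in runs if now - window <= r.finished_at <= now]
    if not recent:
        return None
    failures = sum(1 for r in recent if not r.passed)
    return failures / len(recent)


def monitor_matches(runs: list[RunResult], now: datetime, window: timedelta,
                    threshold: float, min_runs: int) -> bool:
    """Assumed rule: match when enough runs exist and the failure rate meets the threshold."""
    recent = [r for r in runs if now - window <= r.finished_at <= now]
    if len(recent) < min_runs:
        return False
    rate = failure_rate(runs, now, window)
    return rate is not None and rate >= threshold


now = datetime(2026, 5, 13, 12, 0)
runs = [
    RunResult(now - timedelta(days=1), passed=False),
    RunResult(now - timedelta(days=2), passed=True),
    RunResult(now - timedelta(days=3), passed=False),
    RunResult(now - timedelta(days=4), passed=True),
    RunResult(now - timedelta(days=30), passed=False),  # outside a 7-day window
]

print(failure_rate(runs, now, timedelta(days=7)))
```

Unlike a pass-on-retry check, nothing here looks at retries on a single commit; only the aggregate over the window matters.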

samgutentag · 2026-05-13T19:45:13Z

superseded by #656

claude Bot reviewed May 13, 2026

samgutentag closed this May 13, 2026

samgutentag wants to merge 1 commit into main from sam-gutentag/monitor-action-types-v174
