From a7f9cf450719efad121105915ea9bad09930f867 Mon Sep 17 00:00:00 2001 From: Claude Date: Wed, 13 May 2026 18:08:02 +0000 Subject: [PATCH] docs(flaky-tests): document monitor action type selection When creating a failure rate or failure count monitor, users now choose an action type first -- either classify test status (flaky/broken) or apply labels. Add an Action Type section to both monitor pages explaining the two options and moving Detection Type into context as a classify-only setting. Shipped in v174 via trunk2#3945. Co-Authored-By: Claude Opus 4.6 --- .../detection/failure-count-monitor.md | 26 ++++++++++++++++--- flaky-tests/detection/failure-rate-monitor.md | 23 +++++++++++++--- 2 files changed, 42 insertions(+), 7 deletions(-) diff --git a/flaky-tests/detection/failure-count-monitor.md b/flaky-tests/detection/failure-count-monitor.md index 89a17a5c..8f0ca675 100644 --- a/flaky-tests/detection/failure-count-monitor.md +++ b/flaky-tests/detection/failure-count-monitor.md @@ -1,5 +1,5 @@ --- -description: Detect flaky or broken tests as soon as they accumulate a configured number of failures +description: Detect and classify or label tests as soon as they accumulate a configured number of failures --- # Failure Count Monitor @@ -18,14 +18,25 @@ Use the failure count monitor when you want immediate visibility into test failu If you need to detect patterns of intermittent failure over time (e.g., a test that fails 20% of the time), use a [failure rate monitor](failure-rate-monitor.md) instead. If you want to catch tests that fail and then pass on retry within a single commit, [pass-on-retry](pass-on-retry-monitor.md) handles that automatically. +## Action Type + +When creating a failure count monitor, choose what action it takes when a test is flagged: + +- **Classify test status** — marks the test as flaky or broken. This is the default and integrates with quarantine workflows and status-based filtering. +- **Apply labels** — tags matching tests with one or more labels when the monitor activates. Use this when you want to categorize tests automatically without changing their status. See [Automatic labeling from monitors](../management/test-labels.md#automatic-labeling-from-monitors) for details. + +The action type is set at creation and cannot be changed afterward. If you need to switch a monitor's action type, create a new monitor with the desired type and disable the old one. + ## Detection Type -Each failure count monitor has a **detection type** -- either **flaky** or **broken** -- which controls what status a test receives when the monitor flags it: +Applies only to monitors with the **Classify test status** action type. + +Each classify-action failure count monitor has a **detection type** -- either **flaky** or **broken** -- which controls what status a test receives when the monitor flags it: - **Flaky monitors** are appropriate when failures on the monitored branch are likely non-deterministic. A test that fails once on `main` but passes on retry is probably flaky. - **Broken monitors** are appropriate when failures indicate a real regression. If a test fails on `main` and you expect it to keep failing until someone fixes it, broken is the right classification. -The detection type is set at creation and cannot be changed afterward. If you need to switch a monitor's type, create a new monitor with the desired type and disable the old one. +The detection type is set at creation and cannot be changed afterward. If you need to switch a monitor's detection type, create a new monitor with the desired type and disable the old one. ## How It Works @@ -37,6 +48,7 @@ You configure a failure count monitor with: | Setting | Value | |---|---| +| Action type | Classify test status | | Detection type | Broken | | Failure count | 1 | | Window | 30 minutes | @@ -56,6 +68,14 @@ If another test, `test_signup`, also failed during that window, it would be flag ## Configuration +### Action Type + +Choose **Classify test status** or **Apply labels**. See [Action Type](#action-type) above for details. This cannot be changed after the monitor is created. + +### Detection Type + +Appears only when the action type is **Classify test status**. Choose **Flaky** or **Broken**. This determines the status a test receives when the monitor flags it. See [Detection Type](#detection-type) above for guidance. + ### Failure Count The number of failures required to trigger detection. The default is **1**, meaning any single failure on a monitored branch flags the test. diff --git a/flaky-tests/detection/failure-rate-monitor.md b/flaky-tests/detection/failure-rate-monitor.md index 2abac7f3..68b05dd6 100644 --- a/flaky-tests/detection/failure-rate-monitor.md +++ b/flaky-tests/detection/failure-rate-monitor.md @@ -6,16 +6,27 @@ description: Detect flaky or broken tests based on failure rate over a configura The failure rate monitor detects tests based on failure rate over a rolling time window. Unlike pass-on-retry, which looks for a specific pattern on a single commit, the failure rate monitor identifies tests that fail too often over a period of time, even if no individual failure looks like a retry. -You can create multiple failure rate monitors with different configurations. This is how you tailor detection to different branches, test volumes, sensitivity levels, and detection types. +You can create multiple failure rate monitors with different configurations. This is how you tailor detection to different branches, test volumes, sensitivity levels, and action types. + +## Action Type + +When creating a failure rate monitor, choose what action it takes when a test is flagged: + +- **Classify test status** — marks the test as flaky or broken. This is the default and integrates with quarantine workflows and status-based filtering. +- **Apply labels** — tags matching tests with one or more labels when the monitor activates. Use this when you want to categorize tests automatically without changing their status. See [Automatic labeling from monitors](../management/test-labels.md#automatic-labeling-from-monitors) for details. + +The action type is set at creation and cannot be changed afterward. If you need to switch a monitor's action type, create a new monitor with the desired type and disable the old one. ## Detection Type -Each failure rate monitor has a **detection type** — either **flaky** or **broken** — which controls what status a test receives when the monitor flags it: +Applies only to monitors with the **Classify test status** action type. + +Each classify-action failure rate monitor has a **detection type** — either **flaky** or **broken** — which controls what status a test receives when the monitor flags it: - **Flaky monitors** catch tests that fail intermittently (e.g., 20–50% failure rate). These are typically caused by timing issues, shared state, or non-deterministic behavior. - **Broken monitors** catch tests that fail consistently at a high rate (e.g., 80%+ failure rate). These usually indicate a real regression — something in the code or environment is genuinely broken and needs a fix. -The detection type is set at creation and cannot be changed afterward. If you need to switch a monitor's type, create a new monitor with the desired type and disable the old one. +The detection type is set at creation and cannot be changed afterward. If you need to switch a monitor's detection type, create a new monitor with the desired type and disable the old one. This distinction matters because the two problems call for different responses. Flaky tests might be quarantined while you investigate the root cause. Broken tests represent real failures that should be fixed, not hidden. @@ -53,9 +64,13 @@ stale timeout, and branch scope. Capture it with realistic example values filled in (e.g., "Broken on main", Broken detection type, 80% activation, 60% resolution, 6 hour window, 50 min sample, main branch). --> +### Action Type + +Choose **Classify test status** or **Apply labels**. See [Action Type](#action-type) above for details. This cannot be changed after the monitor is created. + ### Detection Type -Choose **Flaky** or **Broken**. This determines the status a test receives when the monitor flags it. See [Detection Type](#detection-type) above for guidance on which to use. +Appears only when the action type is **Classify test status**. Choose **Flaky** or **Broken**. This determines the status a test receives when the monitor flags it. See [Detection Type](#detection-type) above for guidance on which to use. ### Activation Threshold