Skip to content

[6.1] Port Flaky Test Separation#4060

Open
benrr101 wants to merge 5 commits intorelease/6.1from
dev/russellben/flaky/6.1
Open

[6.1] Port Flaky Test Separation#4060
benrr101 wants to merge 5 commits intorelease/6.1from
dev/russellben/flaky/6.1

Conversation

@benrr101
Copy link
Copy Markdown
Contributor

Description

Since 6.1 branch already has the flaky test traits, this PR just updates the PR pipeline to run tests w/o flaky tests, then run only flaky tests, ignoring the results of the latter.

🤖

Codex ported these changes from the branch for #4059

Testing

These changes only apply to PR pipelines. They will be validated here.

@benrr101 benrr101 added this to the 6.1.5 milestone Mar 17, 2026
@benrr101 benrr101 requested a review from a team as a code owner March 17, 2026 22:30
Copilot AI review requested due to automatic review settings March 17, 2026 22:30
@benrr101 benrr101 added Area\Tests Issues that are targeted to tests or test projects Area\Engineering Use this for issues that are targeted for changes in the 'eng' folder or build systems. labels Mar 17, 2026
@github-project-automation github-project-automation bot moved this to To triage in SqlClient Board Mar 17, 2026
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR updates the shared PR pipeline test template to run additional “flaky-only” Functional and Manual test passes (and ignore failures from those passes) to help separate flaky test failures from main PR signal.

Changes:

  • Adds “Run Flaky Functional Tests” and “Run Flaky Manual Tests” steps for both Windows (MSBuild@1) and non-Windows (DotNetCoreCLI@2).
  • Adjusts test-step conditions to succeededOrFailed() so later test steps still execute even when earlier steps fail.
  • Marks flaky-only steps with continueOnError: true to ignore flaky test failures.

@github-project-automation github-project-automation bot moved this from To triage to In progress in SqlClient Board Mar 18, 2026
Copilot AI review requested due to automatic review settings March 19, 2026 16:44
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR aims to port the “flaky test separation” behavior into the 6.1 PR pipeline, so CI runs non-flaky tests normally and then runs flaky tests separately without failing the PR.

Changes:

  • Updates the shared run-all-tests-step.yml template to add dedicated “Run Flaky * Tests” steps that continue on error.
  • Adjusts pipeline step conditions to run subsequent test phases even if an earlier phase fails.
  • Updates ConnectionFailoverTests attributes/traits related to the flaky category.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

File Description
src/Microsoft.Data.SqlClient/tests/UnitTests/SimulatedServerTests/ConnectionFailoverTests.cs Trait/attribute adjustments for flaky categorization in this unit test class.
eng/pipelines/common/templates/steps/run-all-tests-step.yml Adds separate flaky test executions for unit/functional/manual tests and tweaks step conditions/behavior.

@paulmedynski paulmedynski self-assigned this Mar 19, 2026
@paulmedynski
Copy link
Copy Markdown
Contributor

@benrr101 - Code freeze for 6.1.5 will be Tue Apr 7. Will this change be complete by then?

- Add <Filter> MSBuild property with default 'category!=failing&category!=flaky'
  so that -p:Filter=... from the pipeline is honored by dotnet test
- Update all 6 test targets (Unit/Functional/Manual x Windows/Unix) to use
  $(Filter) instead of hard-coded filter strings
- Fix typo in non-Windows flaky unit test step: -p:=Test... -> -p:Test...
- Remove redundant [Trait("Category", "flaky")] on method already covered
  by class-level trait
paulmedynski
paulmedynski previously approved these changes Apr 6, 2026
Copilot AI review requested due to automatic review settings April 7, 2026 00:07
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Ports the “flaky test separation” approach to the 6.1 branch by wiring an MSBuild-level test filter and updating the PR pipeline to run non-flaky tests first, then flaky-only tests with failures ignored.

Changes:

  • Add a Filter MSBuild property in build.proj (defaulting to excluding flaky) and apply it across Unit/Functional/Manual test targets.
  • Update the shared pipeline test step template to run each test suite twice: non-flaky first, then flaky-only with continueOnError: true.
  • Mark the XEvents tracing manual test as flaky and add an xUnit fixture that attempts to clean up orphaned XEvent sessions.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
src/Microsoft.Data.SqlClient/tests/UnitTests/SimulatedServerTests/ConnectionFailoverTests.cs Keeps the class categorized as flaky; minor attribute ordering change.
src/Microsoft.Data.SqlClient/tests/ManualTests/TracingTests/XEventsTracingTest.cs Adds XEvent cleanup fixture and marks the test as flaky.
src/Microsoft.Data.SqlClient/tests/ManualTests/DataCommon/DataTestUtility.cs Adds a note about test-name parsing variability.
eng/pipelines/common/templates/steps/run-all-tests-step.yml Splits pipeline test execution into non-flaky and flaky-only runs (flaky failures ignored).
build.proj Introduces $(Filter) to control dotnet test --filter consistently across test targets.

Comment on lines +66 to +68
using SqlCommand dropCommand = new(
$"DROP EVENT SESSION [{sessionName}] ON DATABASE;",
connection);
Copy link

Copilot AI Apr 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sessionName is interpolated into an identifier in DROP EVENT SESSION [...] without escaping. If a session name contains ] (or other special characters), this can break the statement and can potentially allow SQL injection via crafted object names. Use a safe identifier-quoting approach (e.g., QUOTENAME in dynamic SQL, or escape ] to ]] before bracket-quoting) before executing the DROP.

Copilot uses AI. Check for mistakes.
Comment on lines 99 to 103
_testName = DataTestUtility.CurrentTestName(outputHelper);
}

[Trait("Category", "flaky")]
[ConditionalTheory(typeof(DataTestUtility), nameof(DataTestUtility.AreConnStringsSetup), nameof(DataTestUtility.IsNotAzureSynapse), nameof(DataTestUtility.IsNotManagedInstance))]
Copy link

Copilot AI Apr 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR description says this change "just updates the PR pipeline", but this PR also changes test code here (adds a collection fixture and marks the test Category=flaky). Consider updating the PR description to reflect these additional behavioral changes so they aren’t overlooked during review/release notes.

Copilot uses AI. Check for mistakes.
Comment on lines +33 to +37
SELECT Sessions.name
FROM sys.database_event_sessions Sessions
LEFT JOIN sys.dm_xe_database_sessions Active ON Sessions.name = Active.name
WHERE Active.name IS NULL;
""",
Copy link

Copilot AI Apr 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The cleanup query selects all inactive/orphaned database XEvent sessions and will drop them. That can delete sessions not created by this test suite (e.g., user-created or intentionally stopped sessions) when manual tests are run against a shared Azure SQL database. Consider restricting the query to sessions created by these tests (e.g., a known name prefix/pattern).

Copilot uses AI. Check for mistakes.
Comment on lines +73 to +80
}
catch (SqlException)
{
// Ignore exceptions, as the session may have been cleaned up by another
// test run at the same time.
}

Console.WriteLine($" Dropped orphaned XEvent session: {sessionName}");
Copy link

Copilot AI Apr 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code always logs "Dropped orphaned XEvent session" even when dropCommand.ExecuteNonQuery() throws (the log is outside the try/catch). This produces misleading output. Only log a successful drop when no exception occurred (or emit a different message on failure).

Suggested change
}
catch (SqlException)
{
// Ignore exceptions, as the session may have been cleaned up by another
// test run at the same time.
}
Console.WriteLine($" Dropped orphaned XEvent session: {sessionName}");
Console.WriteLine($" Dropped orphaned XEvent session: {sessionName}");
}
catch (SqlException)
{
// Ignore exceptions, as the session may have been cleaned up by another
// test run at the same time.
Console.WriteLine($" Could not drop orphaned XEvent session: {sessionName}");
}

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Area\Engineering Use this for issues that are targeted for changes in the 'eng' folder or build systems. Area\Tests Issues that are targeted to tests or test projects

Projects

Status: In progress

Development

Successfully merging this pull request may close these issues.

4 participants