chore: delete flaky tests for now #13711

Merged
nomeata merged 1 commit into master from joachim/delete-bad-tests on May 11, 2026

nomeata commented May 11, 2026

This PR deletes two tests that sometimes time out (or crash; it is unclear
which without #13710). I was not able to fix them by EOD.

nomeata enabled auto-merge May 11, 2026 19:18
nomeata added this pull request to the merge queue May 11, 2026
Merged via the queue into master with commit d055778 May 11, 2026
17 checks passed
nomeata added a commit that referenced this pull request May 12, 2026
This PR restores the `cancellation_empty_by.lean` and
`cancellation_par.lean` server-interactive tests that were deleted in
#13711 ("delete flaky tests for now"). With #13710 in place, a hung
worker now surfaces as a prompt `waitForMessage` abort rather than a
1500s CTest timeout, so the tests are tractable to keep enabled.

Trim diagnostic tracing in `cancellation_empty_by.lean` to the
minimum that's actually causally ordered against the test's
synchronisation points:

* `test: imports done` -- synchronous `#eval` at the top of the file,
  fires once before any async task is spawned.
* `tracerSuggestion ready` -- gated to the first `tracerSuggestion`
  invocation via `mkTestTask`, fires exactly once.
* `cancelTokenSet` -- inside the `cancelTk.onSet` callback, fires
  exactly once when `cancelRec` reaches the snapshot.
* `sync received` (x2) -- in `t1`'s body after `wait_for_sync`
  returns, once per elaboration.

Removed traces that were either every-invocation (and therefore raced
non-deterministically with the snapshot task's other output) or that
fired at command-elab boundaries where their position relative to
async stderr buffers was unstable: `tracerSuggestion: entered`,
`tracerSuggestion: returning candidate`, `t1: body entered`,
`test: before/after empty-by example`, `test: file end`.
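
The preference for once-only traces can be illustrated with a minimal
sketch. This is not the code in `cancellation_empty_by.lean`: the real
gating goes through `mkTestTask`, whose definition is not shown here.
`traceFired` and `traceOnce` are hypothetical names, and the sketch
assumes `ST.Ref.modifyGet` updates the flag atomically.

```lean
-- Hypothetical flag, not taken from the test file: has the trace fired yet?
initialize traceFired : IO.Ref Bool ← IO.mkRef false

/-- Print `msg` to stderr on the first call only; later calls are no-ops. -/
def traceOnce (msg : String) : IO Unit := do
  -- Return the old value and set the flag to `true` in a single step.
  let alreadyFired ← traceFired.modifyGet fun fired => (fired, true)
  unless alreadyFired do
    IO.eprintln msg
```

A trace gated this way appears at most once, so its place in the
expected output depends only on when the first invocation happens, not
on how many later invocations interleave with other output. An
`initialize` declaration generally has to live in an imported module
before it can be exercised from `#eval`.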

Also drop the `LEAN_DEBUG_MAX_WORKERS` runtime hack added during
investigation: it served its purpose (confirming the
wait-pressure-driven worker pool ratchet in `task_manager::wait_for`)
but is not appropriate as a long-lived debug knob in the runtime.

Stress: 24/24 parallel runs pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>