Handle Numba cleanup crash in GitHub Actions workflow #28
Modified test.yml to capture pytest output and ignore cleanup crashes (exit codes 134/139) when all tests actually pass. This handles the known Numba memory corruption issue during Python cleanup while still properly detecting actual test failures.
Co-authored-by: deepentropy <8287111+deepentropy@users.noreply.github.com>
Pull request overview
This PR addresses a GitHub Actions workflow failure caused by memory corruption during pytest/Numba cleanup that occurs after all 974 tests successfully pass. The approach captures pytest output and conditionally ignores specific cleanup-related exit codes (134/SIGABRT, 139/SIGSEGV) when tests have passed.
Key changes:
- Captures pytest output to files for post-execution analysis
- Implements exit code exception handling for cleanup crashes (codes 134 and 139)
- Applies the workaround to both "Run unit tests" and "Run tests with coverage" steps
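The capture step described in the list above could look roughly like the sketch below. It is not the code from test.yml: `run_and_capture`, `CAPTURED_EXIT_CODE`, and `fake_pytest` are hypothetical names, and `fake_pytest` only stands in for a real pytest run that ends with status 134.

```shell
# Hypothetical helper (not from the PR): run a command, keep its combined
# output in a file, and record its exit status instead of aborting the step.
run_and_capture() {
  capture_file="$1"
  shift
  # Remember whether errexit (set -e) is active so it can be restored.
  case "$-" in *e*) errexit_was_on=1 ;; *) errexit_was_on=0 ;; esac
  set +e
  "$@" > "$capture_file" 2>&1
  CAPTURED_EXIT_CODE=$?
  if [ "$errexit_was_on" -eq 1 ]; then set -e; fi
  cat "$capture_file"   # still show the output in the job log
}

# Stand-in for pytest: prints a summary line, then reports status 134.
fake_pytest() {
  echo "===== 974 passed in 12.34s ====="
  return 134
}

demo_output="$(mktemp)"
run_and_capture "$demo_output" fake_pytest
echo "captured exit code: $CAPTURED_EXIT_CODE"
```

Redirecting to a file rather than piping through `tee` keeps `$?` tied to the test command itself, which avoids relying on bash-only features such as `PIPESTATUS`.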
    PYTEST_EXIT_CODE=$?
    set -e

    if grep -qE "^=+ .* passed" pytest_cov_output.txt; then
[nitpick] Consider adding a comment explaining why specific exit codes (134 = SIGABRT, 139 = SIGSEGV) are being handled. This would improve maintainability and help future developers understand the workaround.
Example:
    # Exit codes: 134 = SIGABRT (free() corruption), 139 = SIGSEGV (segfault)
    # These indicate Numba cleanup crashes that occur after all tests pass
    if [ "$PYTEST_EXIT_CODE" -eq 134 ] || [ "$PYTEST_EXIT_CODE" -eq 139 ]; then
Suggested change:
      if grep -qE "^=+ .* passed" pytest_cov_output.txt; then
    +   # Exit codes: 134 = SIGABRT (free() corruption), 139 = SIGSEGV (segfault)
    +   # These indicate Numba cleanup crashes that occur after all tests pass.
    +   # If all tests passed, we ignore these cleanup crashes and mark the run as successful.
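For readers wondering where 134 and 139 come from: a shell reports a child killed by signal N as exit status 128 + N. The sketch below reproduces both values without Numba, assuming a Linux runner where SIGABRT is signal 6 and SIGSEGV is signal 11. The variable names are illustrative only.

```shell
# Each child shell kills itself with a signal; the parent then sees 128 + N.
# The "|| var=$?" form keeps this safe when errexit (set -e) is active.
abrt_status=0
sh -c 'kill -s ABRT $$' || abrt_status=$?

segv_status=0
sh -c 'kill -s SEGV $$' || segv_status=$?

echo "SIGABRT -> $abrt_status, SIGSEGV -> $segv_status"
```

This only shows how the codes arise. It says nothing about what triggers the crash during Numba cleanup.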
    set -e

    # Check if all tests passed by looking at the pytest summary line
    if grep -qE "^=+ .* passed" pytest_output.txt; then
The grep pattern ^=+ .* passed will match pytest summary lines that contain both "passed" and "failed" (e.g., "5 passed, 10 failed"). This could cause the workflow to incorrectly treat partial failures as success when a cleanup crash occurs.
Consider using a more precise pattern that ensures ALL tests passed:
    if grep -qE "^=+ [0-9]+ passed in " pytest_output.txt && ! grep -qE " failed" pytest_output.txt; then
Or check for the absence of failures:
    if grep -qE "^=+ .* passed" pytest_output.txt && ! grep -qE "^=+ .* failed" pytest_output.txt; then

    PYTEST_EXIT_CODE=$?
    set -e

    if grep -qE "^=+ .* passed" pytest_cov_output.txt; then
The grep pattern ^=+ .* passed will match pytest summary lines that contain both "passed" and "failed" (e.g., "5 passed, 10 failed"). This could cause the workflow to incorrectly treat partial failures as success when a cleanup crash occurs.
Consider using a more precise pattern that ensures ALL tests passed:
    if grep -qE "^=+ [0-9]+ passed in " pytest_cov_output.txt && ! grep -qE " failed" pytest_cov_output.txt; then
Or check for the absence of failures:
    if grep -qE "^=+ .* passed" pytest_cov_output.txt && ! grep -qE "^=+ .* failed" pytest_cov_output.txt; then
Suggested change:
    - if grep -qE "^=+ .* passed" pytest_cov_output.txt; then
    + if grep -qE "^=+ [0-9]+ passed in " pytest_cov_output.txt && ! grep -qE " failed" pytest_cov_output.txt; then
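To compare the patterns discussed in this comment, the sketch below runs a check against a few hand-written summary lines. The sample lines are illustrative, not output from this repository, and `all_tests_passed` is a hypothetical helper. It follows the "absence of failures" variant and also rejects summaries that mention errors, which is an added assumption about what should count as a clean run.

```shell
# Hypothetical helper: succeed only if the summary reports passes and
# no "failed" or "error" counts. The sample summaries below are illustrative.
all_tests_passed() {
  grep -qE '^=+ .* passed' "$1" && ! grep -qE '^=+ .*(failed|error)' "$1"
}

sample_dir="$(mktemp -d)"
echo "===== 974 passed in 12.34s =====" > "$sample_dir/clean.txt"
echo "===== 974 passed, 3 warnings in 12.34s =====" > "$sample_dir/warnings.txt"
echo "===== 10 failed, 5 passed in 3.21s =====" > "$sample_dir/mixed.txt"

for name in clean warnings mixed; do
  if all_tests_passed "$sample_dir/$name.txt"; then
    echo "$name: all passed"
  else
    echo "$name: not a clean pass"
  fi
done
```

One thing to watch with the stricter `^=+ [0-9]+ passed in ` pattern: it requires "passed in" to be adjacent, so a summary such as "974 passed, 3 warnings in 12.34s" would not match and a cleanup crash on an otherwise green run would still fail the job.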
    # Check if all tests passed by looking at the pytest summary line
    if grep -qE "^=+ .* passed" pytest_output.txt; then
      # If tests passed but exit code is non-zero (cleanup crash), treat as success
      if [ "$PYTEST_EXIT_CODE" -eq 134 ] || [ "$PYTEST_EXIT_CODE" -eq 139 ]; then
[nitpick] Consider adding a comment explaining why specific exit codes (134 = SIGABRT, 139 = SIGSEGV) are being handled. This would improve maintainability and help future developers understand the workaround.
Example:
    # Exit codes: 134 = SIGABRT (free() corruption), 139 = SIGSEGV (segfault)
    # These indicate Numba cleanup crashes that occur after all tests pass
    if [ "$PYTEST_EXIT_CODE" -eq 134 ] || [ "$PYTEST_EXIT_CODE" -eq 139 ]; then
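Putting the commented exit-code check together with the summary check, the overall decision might be sketched as below. `resolve_exit_code` is a hypothetical function and the sample files are made up for illustration. The real step in test.yml nests the checks in the opposite order, so treat this as one possible arrangement rather than the PR's code.

```shell
# Hypothetical helper: print the status the step should finish with.
# 134 = SIGABRT, 139 = SIGSEGV. These are only forgiven when the captured
# pytest summary shows passes and no failures.
resolve_exit_code() {
  exit_code="$1"
  output_file="$2"
  if [ "$exit_code" -eq 134 ] || [ "$exit_code" -eq 139 ]; then
    if grep -qE '^=+ .* passed' "$output_file" \
       && ! grep -qE '^=+ .* failed' "$output_file"; then
      echo 0
      return
    fi
  fi
  echo "$exit_code"
}

# Illustrative inputs, not real output from this repository.
passed_file="$(mktemp)"
echo "===== 974 passed in 12.34s =====" > "$passed_file"
mixed_file="$(mktemp)"
echo "===== 10 failed, 5 passed in 3.21s =====" > "$mixed_file"

crash_after_pass="$(resolve_exit_code 134 "$passed_file")"
crash_with_failures="$(resolve_exit_code 134 "$mixed_file")"
ordinary_failure="$(resolve_exit_code 1 "$passed_file")"
echo "$crash_after_pass $crash_with_failures $ordinary_failure"
```

In a workflow step this would typically end with something like `exit "$(resolve_exit_code "$PYTEST_EXIT_CODE" pytest_output.txt)"`, so any other non-zero status still fails the job.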
All 974 tests pass, but the workflow fails with "free(): invalid next size (fast)" during Python/Numba cleanup (exit code 134). A previous fix adding NUMBA_DISABLE_CACHING: 1 did not resolve this.
Changes
Applied to both "Run unit tests" and "Run tests with coverage" steps.
Original prompt
This pull request was created as a result of the following prompt from Copilot chat.