
Bytecode parity - for-loop cleanup, return duplication, late jump threading fix #7580

Open
youknowone wants to merge 6 commits into RustPython:main from youknowone:bytecode-parity-jump

Conversation

@youknowone
Member

@youknowone youknowone commented Apr 10, 2026

  • Use POP_TOP instead of POP_ITER for for-loop break/return cleanup
  • Expand duplicate_end_returns to clone final return for jump predecessors
  • Restrict late jump threading pass to unconditional jumps only
  • Skip exception blocks in inline/reorder passes
  • Simplify threaded_jump_instr NoInterrupt handling
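The "late jump threading" restriction above can be sketched in miniature. This is an illustrative toy pass, not RustPython's actual `threaded_jump_instr` implementation: each block either unconditionally jumps to another label or is a "real" block, and threading retargets jumps whose destination is itself just another unconditional jump.

```python
# Minimal jump-threading sketch over a toy CFG. Each label maps to the
# label it unconditionally jumps to, or None for a "real" block.
# Conditional jumps are deliberately not modeled, mirroring the PR's
# restriction of the late pass to unconditional jumps only.
def thread_jumps(jump_target):
    resolved = {}
    for label in jump_target:
        seen = set()
        cur = label
        # Follow chains of unconditional jumps until a real block (or a cycle).
        while jump_target.get(cur) is not None and cur not in seen:
            seen.add(cur)
            cur = jump_target[cur]
        resolved[label] = cur
    return resolved

# A -> B -> C, where C is a real block: jumps through A and B both land on C.
cfg = {"A": "B", "B": "C", "C": None}
assert thread_jumps(cfg) == {"A": "C", "B": "C", "C": "C"}
```

Threading conditional jumps the same way is what the fix rules out, since a conditional jump's fall-through edge changes the CFG shape in ways this chain-following does not account for.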

Summary by CodeRabbit

  • New Features

    • AST objects gain a public deepcopy for reliable deep copies.
  • Bug Fixes

    • More robust traceback generation when location data is missing.
    • Fixes to generator/coroutine finalization and frame cleanup.
    • Corrected comparison behavior to honor boolean coercion where expected.
    • Safer OS write loop handling of interruptions.
  • Performance

    • Improved bytecode control-flow optimizations and jump threading.
  • Tests

    • Added threading-related tests exercising threaded imports and exec.
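The AST deepcopy item above targets parity with CPython's behavior, where `copy.deepcopy` on an `ast` tree produces an independent object graph. A quick CPython-compatible illustration of the behavior being matched:

```python
import ast
import copy

tree = ast.parse("x = 1 + 2")
clone = copy.deepcopy(tree)

# The clone is a distinct object graph: mutating it leaves the original intact.
clone.body[0].targets[0].id = "y"
assert tree.body[0].targets[0].id == "x"
assert clone.body[0].targets[0].id == "y"
```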

@coderabbitai
Contributor

coderabbitai bot commented Apr 10, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

📝 Walkthrough

Refines codegen and IR control flow, updates COMPARE_OP encoding/formatting, adjusts VM frame/coroutine lifecycle and traceback handling, adds AST deepcopy, centralizes thread stack-size logic and threading tests, and updates disassembly to emit raw instruction streams with cache-awareness.

Changes

Cohort / File(s) Summary
Codegen: loop/exception cleanup & control-flow
crates/codegen/src/compile.rs
Replaced PopIter with PopTop for several loop-cleanup paths, made try/except/orelse/with fallthroughs emit explicit no-interrupt jumps, and adjusted yield-from fail-path control flow.
IR: CFG rewrite, jump-threading & peepholes
crates/codegen/src/ir.rs
Added reorder_jump_over_exception_cleanup_blocks pass, enhanced jump-threading with JumpThreadKind and threaded_jump_instr, added exception-block classification helpers, and refined duplicate-return/jump-cloning logic.
COMPARE_OP encoding & disassembly
crates/compiler-core/src/bytecode/oparg.rs, crates/compiler-core/src/bytecode/instruction.rs, scripts/dis_dump.py
Introduced COMPARE_OP_BOOL_MASK and Display for ComparisonOperator, changed CompareOp disassembly to show bool(...) when masked, and updated disassembly extraction to emit raw instruction streams with synthetic CACHE entries and new jump target math.
Frame compare & traceback handling
crates/vm/src/frame.rs, crates/vm/src/stdlib/sys/monitoring.rs
Moved compare decoding into execute_compare (accepting raw OpArg) and honor bool-mask coercion; made traceback/line-location lookups resilient to missing location entries and switched instrumentation to use linetable-derived per-instruction locations.
Coroutine/generator lifecycle
crates/vm/src/coroutine.rs
Made finalization conditional on whether frame was entered, propagate entered_frame from run_with_context, clear locals/stack on return/error, and adjust send/throw/close semantics accordingly.
Thread stack & stdlib threading
crates/vm/src/stdlib/_thread.rs, crates/vm/src/vm/mod.rs, extra_tests/snippets/stdlib_threading.py
Added apply_thread_stack_size helper and VirtualMachine::current_c_stack_size() helpers, switched thread-local keying to u64, and added threaded import/exec tests.
AST deepcopy & constructor behavior
crates/vm/src/stdlib/_ast/python.rs
Added AST.__deepcopy__ (memo validation, per-node memoing) and removed singleton-return optimization so constructors always allocate fresh instances.
OS write & other small fixes
crates/vm/src/stdlib/os.rs, scripts/dis_dump.py (also listed above)
Fixed os._os.write to materialize data into an owned Vec<u8> before repeated writes to avoid borrowed-slice lifetime issues; adjusted disassembly handling as noted.
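The `os.write` fix above is about Rust borrow lifetimes, but the underlying pattern (a write loop over an owned buffer that retries partial writes) is easy to show at the Python level. This is an illustrative sketch, not the RustPython implementation; `write_all` is a hypothetical helper name.

```python
import os

def write_all(fd, data):
    # os.write may write fewer bytes than requested; loop over an owned
    # buffer until everything is written. (Since PEP 475, os.write already
    # retries EINTR internally.)
    view = memoryview(bytes(data))
    total = 0
    while total < len(view):
        total += os.write(fd, view[total:])
    return total

r, w = os.pipe()
try:
    assert write_all(w, b"hello") == 5
    assert os.read(r, 5) == b"hello"
finally:
    os.close(r)
    os.close(w)
```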

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Poem

🐰 I hopped through CFGs with glee,
Swapped PopIter for PopTop, whee!
I threaded jumps, fixed compare's mask,
Deepcopied ASTs for every task,
Tuned stacks and tests — hooray for me! 🎩

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The PR title clearly and concisely summarizes the main changes: for-loop cleanup instruction changes (POP_TOP vs POP_ITER), return duplication for jump predecessors, and jump threading fix.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.



@ShaharNaveh
Contributor

ShaharNaveh commented Apr 11, 2026

@youknowone
reviewdog has the following error:

time=2026-04-10T11:11:07.604Z level=INFO msg="reviewdog: failed to post a review comment: POST https://api.github.com/repos/RustPython/RustPython/pulls/7580/reviews: 403 Resource not accessible by integration []"
reviewdog: This GitHub Token doesn't have write permission of Review API [1],
so reviewdog will report results via logging command [2] and create annotations similar to
github-pr-check reporter as a fallback.
[1]: https://docs.github.com/en/actions/reference/events-that-trigger-workflows#pull_request_target
[2]: https://docs.github.com/en/actions/using-workflows/workflow-commands-for-github-actions
reviewdog: found at least one issue with severity greater than or equal to the given level: error

This seems related to the repository settings — can you please take a look? With the right permissions, reviewdog would be able to post committable comments with auto-fixes. It works on forks, so I suspect a default setting was changed.

@youknowone youknowone force-pushed the bytecode-parity-jump branch from a720993 to a082b4c Compare April 11, 2026 13:59
@youknowone
Member Author

Posting requires write permission, but the token's permission is read-only.

@youknowone youknowone marked this pull request as ready for review April 11, 2026 14:59
    emit!(self, Instruction::PopTop);
} else {
    emit!(self, Instruction::PopIter);
    emit!(self, Instruction::PopTop);
Member Author


If both branches emit PopTop, why not hoist it out of the branch?

const TIMEOUT_MAX: f64 = (TIMEOUT_MAX_IN_MICROSECONDS / 1_000_000) as f64;
const TIMEOUT_MAX_SECONDS: f64 = (TIMEOUT_MAX_IN_MICROSECONDS / 1_000_000) as f64;

#[pyattr(name = "TIMEOUT_MAX")]
Member Author


What's the point of this change? Why not keep it as a const?

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
crates/vm/src/coroutine.rs (1)

76-88: ⚠️ Potential issue | 🔴 Critical

Guard cleanup from preflight already executing errors before clearing frame state

maybe_close() currently treats all Err(_) as terminal. When run_with_context() fails its preflight check (already executing), finalize_send_result() still calls maybe_close(), and Line 87 now clears locals/stack while the frame may still be running. That violates clear_locals_and_stack()’s safety assumption (“frame is not executing”) and can corrupt execution state.

Proposed fix
 fn maybe_close(&self, res: &PyResult<ExecutionResult>) {
+    // `run_with_context` can return preflight errors (e.g. "already executing")
+    // without entering/resuming the frame. Do not close/clear in that case.
+    if self.running.load() {
+        return;
+    }
     match res {
         Ok(ExecutionResult::Return(_)) | Err(_) => {
             self.closed.store(true);
             // Frame is no longer suspended; allow frame.clear() to succeed.
             self.frame.owner.store(
                 FrameOwner::FrameObject as i8,
                 core::sync::atomic::Ordering::Release,
             );
             // Completed generators/coroutines should not keep their locals
             // alive while the wrapper object itself remains referenced.
             self.frame.clear_locals_and_stack();
         }
         Ok(ExecutionResult::Yield(_)) => {}
     }
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/vm/src/coroutine.rs` around lines 76 - 88, maybe_close() currently
treats any Err(_) as terminal and clears the frame even for preflight "already
executing" failures; update maybe_close(&self, res: &PyResult<ExecutionResult>)
to detect the specific preflight/already-executing error returned by
run_with_context()/finalize_send_result() and skip the close/clear sequence for
that case. Concretely, pattern-match Err(e) and if e corresponds to the
already-executing/preflight error (use the same error type/variant or message
check used by run_with_context), simply return without calling
self.closed.store(...), self.frame.owner.store(...), or
self.frame.clear_locals_and_stack(); only perform the closing/clearing for true
terminal results (Ok(ExecutionResult::Return(_)) or other non-preflight Err
cases).
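The preflight error the review is worried about has a direct Python-level counterpart: re-entering a running generator raises `ValueError("generator already executing")` before the frame advances at all, which is exactly the case where closing and clearing frame state would be wrong.

```python
def gen():
    # Re-entering the running generator triggers the preflight error the
    # suggested guard is meant to detect before any frame cleanup happens.
    yield next(it)

it = gen()
try:
    next(it)
except ValueError as exc:
    message = str(exc)

assert "already executing" in message
```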
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@crates/codegen/src/ir.rs`:
- Around line 3676-3678: Do not count old_next as a predecessor because the
cloned block is jump-only; remove (or disable) the increment of
predecessors[old_next.idx()] (the BlockIdx::NULL check may remain) so
propagate_line_numbers won't treat old_next as having an extra predecessor;
update the logic around old_next in the same block that manipulates predecessors
to avoid adding a fall-through edge for jump-only cloned blocks referenced by
old_next.

---

Outside diff comments:
In `@crates/vm/src/coroutine.rs`:
- Around line 76-88: maybe_close() currently treats any Err(_) as terminal and
clears the frame even for preflight "already executing" failures; update
maybe_close(&self, res: &PyResult<ExecutionResult>) to detect the specific
preflight/already-executing error returned by
run_with_context()/finalize_send_result() and skip the close/clear sequence for
that case. Concretely, pattern-match Err(e) and if e corresponds to the
already-executing/preflight error (use the same error type/variant or message
check used by run_with_context), simply return without calling
self.closed.store(...), self.frame.owner.store(...), or
self.frame.clear_locals_and_stack(); only perform the closing/clearing for true
terminal results (Ok(ExecutionResult::Return(_)) or other non-preflight Err
cases).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

Run ID: 493c9d45-8b4b-48a5-bbcb-4b87d10ecb4c

📥 Commits

Reviewing files that changed from the base of the PR and between 9cf7bcd and a082b4c.

⛔ Files ignored due to path filters (4)
  • crates/codegen/src/snapshots/rustpython_codegen__compile__tests__if_ands.snap is excluded by !**/*.snap
  • crates/codegen/src/snapshots/rustpython_codegen__compile__tests__if_mixed.snap is excluded by !**/*.snap
  • crates/codegen/src/snapshots/rustpython_codegen__compile__tests__if_ors.snap is excluded by !**/*.snap
  • crates/codegen/src/snapshots/rustpython_codegen__compile__tests__nested_double_async_with.snap is excluded by !**/*.snap
📒 Files selected for processing (8)
  • crates/codegen/src/compile.rs
  • crates/codegen/src/ir.rs
  • crates/compiler-core/src/bytecode/instruction.rs
  • crates/compiler-core/src/bytecode/oparg.rs
  • crates/vm/src/coroutine.rs
  • crates/vm/src/frame.rs
  • crates/vm/src/stdlib/_thread.rs
  • scripts/dis_dump.py

…eading fix

- Use POP_TOP instead of POP_ITER for for-loop break/return cleanup
- Expand duplicate_end_returns to clone final return for jump predecessors
- Restrict late jump threading pass to unconditional jumps only
- Skip exception blocks in inline/reorder passes
- Simplify threaded_jump_instr NoInterrupt handling
@youknowone youknowone force-pushed the bytecode-parity-jump branch from a082b4c to 41e6ffe Compare April 11, 2026 16:09
@github-actions
Contributor

github-actions bot commented Apr 11, 2026

📦 Library Dependencies

The following Lib/ modules were modified. Here are their dependencies:

(module 'dis test_yield_from' not found)

Legend:

  • [+] path exists in CPython
  • [x] up-to-date, [ ] outdated

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
crates/vm/src/stdlib/sys/monitoring.rs (1)

394-403: ⚠️ Potential issue | 🟡 Minor

Add PopTop to the excluded opcodes list in the monitoring instrumentation pass, or verify synthetic cleanup instructions are marked NO_LOCATION.

Lines 400, 465, and 487 exclude PopIter from INSTRUMENTED_LINE tracking, but this PR replaces loop break/return cleanup from PopIter to PopTop (lines 1520, 9933). The synthetic PopTop instructions are emitted without set_no_location() calls (unlike other synthetic instructions in exception handling), which means they inherit the break/return statement's location and will be included in INSTRUMENTED_LINE events. Either add PopTop to the exclusion patterns at these three locations, or confirm and document that all synthetic cleanup PopTop emissions are guaranteed to be marked as NO_LOCATION in the linetable.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/vm/src/stdlib/sys/monitoring.rs` around lines 394 - 403, The
monitoring instrumentation currently excludes PopIter from INSTRUMENTED_LINE
tracking but not PopTop, which causes synthetic PopTop cleanup to be
instrumented; update the monitoring instrumentation pass to also exclude
Instruction::PopTop wherever Instruction::PopIter is excluded (the three
exclusion match sites in monitoring.rs) so PopTop is omitted from
INSTRUMENTED_LINE events, or alternatively ensure every synthetic emission of
PopTop calls set_no_location() at the emitter site so it carries NO_LOCATION;
reference Instruction::PopTop, Instruction::PopIter, INSTRUMENTED_LINE, and
set_no_location() when making the change.
♻️ Duplicate comments (1)
crates/codegen/src/ir.rs (1)

3767-3769: ⚠️ Potential issue | 🟡 Minor

Don’t count old_next as a predecessor here.

The cloned block is jump-only, so it never falls through to old_next. Incrementing this count makes later line-number propagation treat old_next as multi-predecessor and can suppress location propagation unnecessarily.

Suggested fix
         predecessors[target.idx()] -= 1;
         predecessors.push(1);
-        if old_next != BlockIdx::NULL {
-            predecessors[old_next.idx()] += 1;
-        }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/codegen/src/ir.rs` around lines 3767 - 3769, The code increments
predecessors[old_next.idx()] for old_next when old_next != BlockIdx::NULL, but
since the cloned block is jump-only and never falls through to old_next this
incorrectly marks old_next as having an extra predecessor; remove the increment
of predecessors for old_next in this branch (i.e., do not modify the
predecessors array for old_next in the cloned-block path) so that later
line-number propagation treats old_next correctly as a single-predecessor block;
locate the logic around old_next, predecessors, and BlockIdx::NULL in the
function handling the cloned block and delete or skip the
predecessors[old_next.idx()] += 1 operation.
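The predecessor-count argument above can be checked with a toy model. This is an illustrative sketch, not RustPython's IR: each block records an optional jump target and whether it falls through to the next block; a jump-only cloned block has no fall-through, so it must not add an edge to `old_next`.

```python
# Toy predecessor counting over a block list. Each entry is
# (jump_target_index_or_None, falls_through_to_next_block).
def count_predecessors(blocks):
    preds = [0] * len(blocks)
    for i, (jump, falls_through) in enumerate(blocks):
        if jump is not None:
            preds[jump] += 1
        if falls_through and i + 1 < len(blocks):
            preds[i + 1] += 1
    return preds

# Block 0 falls through to block 1; block 1 is jump-only to block 3, so it
# contributes no fall-through edge into block 2. Block 2 falls through to 3.
blocks = [(None, True), (3, False), (None, True), (None, False)]
assert count_predecessors(blocks) == [0, 1, 0, 2]
```

Counting a fall-through edge for the jump-only block (index 1 into index 2 here) would inflate block 2's count from 0, which is the kind of over-count that makes a later line-number propagation pass treat a block as multi-predecessor.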
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@crates/codegen/src/ir.rs`:
- Around line 3323-3325: The helper block_is_exceptional currently only checks
Block.except_handler and Block.preserve_lasti and thus misses cleanup-only
unwind blocks (e.g., POP_EXCEPT ... RERAISE), allowing exception-cleanup opcodes
to be moved into normal flow; update block_is_exceptional to also consider
is_exception_cleanup_block() (or inline that predicate) so it returns true for
cleanup-only unwind blocks, and/or replace calls to block_is_exceptional at the
callers inline_small_or_no_lineno_blocks and the two conditional reorder passes
with a combined predicate that ORs is_exception_cleanup_block, ensuring those
passes treat cleanup-only unwind blocks as exceptional.

In `@crates/vm/src/coroutine.rs`:
- Around line 160-170: The running.load() precheck currently races with frame
mutation because locals_to_fast() is called before acquiring the exclusive
running guard; move the frame mutation inside the critical section by calling
locals_to_fast() only after the guard is acquired (i.e. inside the closure
passed to run_with_context or after compare_exchange succeeds), so the frame
state is touched while running is true. Update both the coroutine.resume path
(where locals_to_fast() is called before run_with_context/compare_exchange) and
the send() path to perform locals_to_fast() after entering the run_with_context
closure (or immediately after successfully setting running) and keep
finalize_send_result/result handling unchanged. Ensure you reference the running
flag access, locals_to_fast(), run_with_context(), resume(), and send() when
applying the change.

---

Outside diff comments:
In `@crates/vm/src/stdlib/sys/monitoring.rs`:
- Around line 394-403: The monitoring instrumentation currently excludes PopIter
from INSTRUMENTED_LINE tracking but not PopTop, which causes synthetic PopTop
cleanup to be instrumented; update the monitoring instrumentation pass to also
exclude Instruction::PopTop wherever Instruction::PopIter is excluded (the three
exclusion match sites in monitoring.rs) so PopTop is omitted from
INSTRUMENTED_LINE events, or alternatively ensure every synthetic emission of
PopTop calls set_no_location() at the emitter site so it carries NO_LOCATION;
reference Instruction::PopTop, Instruction::PopIter, INSTRUMENTED_LINE, and
set_no_location() when making the change.

---

Duplicate comments:
In `@crates/codegen/src/ir.rs`:
- Around line 3767-3769: The code increments predecessors[old_next.idx()] for
old_next when old_next != BlockIdx::NULL, but since the cloned block is
jump-only and never falls through to old_next this incorrectly marks old_next as
having an extra predecessor; remove the increment of predecessors for old_next
in this branch (i.e., do not modify the predecessors array for old_next in the
cloned-block path) so that later line-number propagation treats old_next
correctly as a single-predecessor block; locate the logic around old_next,
predecessors, and BlockIdx::NULL in the function handling the cloned block and
delete or skip the predecessors[old_next.idx()] += 1 operation.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

Run ID: e402ee32-9ee5-4d0e-97b3-8782fee21d7a

📥 Commits

Reviewing files that changed from the base of the PR and between a082b4c and 41e6ffe.

⛔ Files ignored due to path filters (6)
  • Lib/test/test_dis.py is excluded by !Lib/**
  • Lib/test/test_yield_from.py is excluded by !Lib/**
  • crates/codegen/src/snapshots/rustpython_codegen__compile__tests__if_ands.snap is excluded by !**/*.snap
  • crates/codegen/src/snapshots/rustpython_codegen__compile__tests__if_mixed.snap is excluded by !**/*.snap
  • crates/codegen/src/snapshots/rustpython_codegen__compile__tests__if_ors.snap is excluded by !**/*.snap
  • crates/codegen/src/snapshots/rustpython_codegen__compile__tests__nested_double_async_with.snap is excluded by !**/*.snap
📒 Files selected for processing (9)
  • crates/codegen/src/compile.rs
  • crates/codegen/src/ir.rs
  • crates/compiler-core/src/bytecode/instruction.rs
  • crates/compiler-core/src/bytecode/oparg.rs
  • crates/vm/src/coroutine.rs
  • crates/vm/src/frame.rs
  • crates/vm/src/stdlib/_thread.rs
  • crates/vm/src/stdlib/sys/monitoring.rs
  • scripts/dis_dump.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • crates/compiler-core/src/bytecode/instruction.rs

@youknowone youknowone marked this pull request as draft April 11, 2026 16:39
@youknowone youknowone marked this pull request as ready for review April 12, 2026 05:03
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

♻️ Duplicate comments (2)
crates/codegen/src/ir.rs (2)

3324-3326: ⚠️ Potential issue | 🟠 Major

Treat cleanup-only unwind blocks as exceptional.

block_is_exceptional() still returns false for POP_EXCEPT ... RERAISE cleanup blocks, so inline_small_or_no_lineno_blocks() and the conditional reorder passes can still move exception-cleanup bytecode into normal flow.

Suggested fix
 fn block_is_exceptional(block: &Block) -> bool {
-    block.except_handler || block.preserve_lasti
+    block.except_handler || block.preserve_lasti || is_exception_cleanup_block(block)
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/codegen/src/ir.rs` around lines 3324 - 3326, block_is_exceptional
currently only checks Block.except_handler and Block.preserve_lasti, so
cleanup-only unwind blocks (POP_EXCEPT ... RERAISE) are not treated as
exceptional; update fn block_is_exceptional(block: &Block) -> bool to also
return true when the block represents a cleanup unwind (e.g. check a field like
block.cleanup_unwind or block.is_cleanup_unwind), ensuring such blocks are
considered exceptional by inline_small_or_no_lineno_blocks() and the conditional
reorder passes.
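A cleanup-only unwind block, in the sense used above, can be recognized by its instruction shape. The classifier below is a hypothetical sketch over opcode names, not RustPython's `is_exception_cleanup_block`; the `CLEANUP_OPS` set is an assumption for illustration.

```python
# Classify a block as "cleanup-only unwind": nothing but exception plumbing
# (POP_EXCEPT / POP_TOP) ending in RERAISE. Such a block should be treated
# as exceptional so optimization passes never move it into normal flow.
CLEANUP_OPS = {"POP_EXCEPT", "POP_TOP", "RERAISE"}

def is_exception_cleanup_block(instructions):
    return (
        bool(instructions)
        and instructions[-1] == "RERAISE"
        and all(op in CLEANUP_OPS for op in instructions)
    )

assert is_exception_cleanup_block(["POP_EXCEPT", "RERAISE"])
assert not is_exception_cleanup_block(["LOAD_CONST", "RETURN_VALUE"])
```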

3776-3778: ⚠️ Potential issue | 🟡 Minor

Don’t count old_next as a predecessor here.

This cloned block is jump-only, so it never falls through to old_next. Incrementing that count can make propagate_line_numbers() treat old_next as multi-predecessor and skip location propagation unnecessarily.

Suggested fix
         predecessors[target.idx()] -= 1;
         predecessors.push(1);
-        if old_next != BlockIdx::NULL {
-            predecessors[old_next.idx()] += 1;
-        }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/codegen/src/ir.rs` around lines 3776 - 3778, The code increments
predecessors[old_next.idx()] for cloned jump-only blocks, incorrectly marking
old_next as having an extra predecessor; remove the increment (the if block that
checks old_next != BlockIdx::NULL and does predecessors[old_next.idx()] += 1) so
cloned jump-only blocks do not contribute a predecessor count, which prevents
propagate_line_numbers() from misclassifying old_next as multi-predecessor.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@extra_tests/snippets/builtin_exec_thread_import.py`:
- Line 9: The exec() call is intentionally used and triggers S102; add an
explicit per-line ruff/flake8-bandit waiver by appending a noqa suppression to
the exec invocation (e.g., change the exec(code, scope, scope) line to include
"# noqa: S102") so the linter accepts this intentional use of exec() in
builtin_exec_thread_import.py.

---

Duplicate comments:
In `@crates/codegen/src/ir.rs`:
- Around line 3324-3326: block_is_exceptional currently only checks
Block.except_handler and Block.preserve_lasti, so cleanup-only unwind blocks
(POP_EXCEPT ... RERAISE) are not treated as exceptional; update fn
block_is_exceptional(block: &Block) -> bool to also return true when the block
represents a cleanup unwind (e.g. check a field like block.cleanup_unwind or
block.is_cleanup_unwind), ensuring such blocks are considered exceptional by
inline_small_or_no_lineno_blocks() and the conditional reorder passes.
- Around line 3776-3778: The code increments predecessors[old_next.idx()] for
cloned jump-only blocks, incorrectly marking old_next as having an extra
predecessor; remove the increment (the if block that checks old_next !=
BlockIdx::NULL and does predecessors[old_next.idx()] += 1) so cloned jump-only
blocks do not contribute a predecessor count, which prevents
propagate_line_numbers() from misclassifying old_next as multi-predecessor.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

Run ID: eb8da607-9dc3-40e1-ad80-14b6bdb57b40

📥 Commits

Reviewing files that changed from the base of the PR and between 41e6ffe and 28b324d.

📒 Files selected for processing (10)
  • crates/codegen/src/compile.rs
  • crates/codegen/src/ir.rs
  • crates/vm/src/coroutine.rs
  • crates/vm/src/stdlib/_ast/python.rs
  • crates/vm/src/stdlib/_thread.rs
  • crates/vm/src/stdlib/os.rs
  • crates/vm/src/stdlib/sys/monitoring.rs
  • crates/vm/src/vm/mod.rs
  • extra_tests/snippets/builtin_exec_thread_import.py
  • extra_tests/snippets/thread_import_stack.py
✅ Files skipped from review due to trivial changes (1)
  • crates/codegen/src/compile.rs
🚧 Files skipped from review as they are similar to previous changes (2)
  • crates/vm/src/stdlib/sys/monitoring.rs
  • crates/vm/src/stdlib/_thread.rs

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

♻️ Duplicate comments (1)
extra_tests/snippets/stdlib_threading.py (1)

23-23: ⚠️ Potential issue | 🟠 Major

Suppress intentional exec() to pass Ruff S102.

Line 23 intentionally uses exec, but it currently trips linting. Add an explicit inline waiver so CI does not fail.

Proposed fix
-        exec(code, scope, scope)
+        exec(code, scope, scope)  # noqa: S102 - intentional threaded exec regression test

As per coding guidelines: **/*.py: "Use ruff for linting Python code".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@extra_tests/snippets/stdlib_threading.py` at line 23, The exec call using
variables code and scope intentionally triggers S102; suppress it by adding an
inline Ruff waiver on the same line — change the exec(code, scope, scope)
invocation to include the noqa comment for S102 (e.g., append "  # noqa: S102")
so the intentional exec is allowed by CI while keeping behavior unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@crates/vm/src/frame.rs`:
- Around line 1525-1534: The code currently skips creating a traceback entry
when frame.code.locations.get(idx) returns None; change this to fall back to a
previous mapped location or the function's first line instead of dropping the
frame. In the block that builds the traceback (where you call
frame.code.locations.get(idx), PyTraceback::new and
exception.set_traceback_typed), if get(idx) is None attempt
frame.code.locations.get(idx.saturating_sub(1)) and if that is also None use a
fallback like frame.code.first_line (or the first location's line), then
construct PyTraceback::new with that line and set_traceback_typed as before.
Apply the same fallback logic to the other analogous block (the one around the
second occurrence you noted) so lasti/f_lineno restoration still runs and the
frame always gets a traceback entry.

---

Duplicate comments:
In `@extra_tests/snippets/stdlib_threading.py`:
- Line 23: The exec call using variables code and scope intentionally triggers
S102; suppress it by adding an inline Ruff waiver on the same line — change the
exec(code, scope, scope) invocation to include the noqa comment for S102 (e.g.,
append "  # noqa: S102") so the intentional exec is allowed by CI while keeping
behavior unchanged.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

Run ID: 20afd674-e56e-42f3-aeb6-293e58ef6a31

📥 Commits

Reviewing files that changed from the base of the PR and between 28b324d and 9f76666.

📒 Files selected for processing (3)
  • crates/vm/src/frame.rs
  • crates/vm/src/stdlib/_thread.rs
  • extra_tests/snippets/stdlib_threading.py

Comment on lines +1525 to +1534
if let Some((loc, _end_loc)) = frame.code.locations.get(idx) {
    let next = exception.__traceback__();
    let new_traceback = PyTraceback::new(
        next,
        frame.object.to_owned(),
        idx as u32 * 2,
        loc.line,
    );
    exception.set_traceback_typed(Some(new_traceback.into_ref(&vm.ctx)));
}
Contributor


⚠️ Potential issue | 🟠 Major

Don't drop the current frame when locations.get(idx) misses.

Avoiding the panic is good, but if this lookup returns None for the first raise in this frame, the exception now escapes without a traceback entry for the current frame at all. That also prevents the lasti restore at Line 1702 from running, so f_lineno can stay stale after cleanup. Please fall back to the previous mapped line / first line rather than skipping PyTraceback::new.

Also applies to: 1587-1603

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/vm/src/frame.rs` around lines 1525 - 1534, The code currently skips
creating a traceback entry when frame.code.locations.get(idx) returns None;
change this to fall back to a previous mapped location or the function's first
line instead of dropping the frame. In the block that builds the traceback
(where you call frame.code.locations.get(idx), PyTraceback::new and
exception.set_traceback_typed), if get(idx) is None attempt
frame.code.locations.get(idx.saturating_sub(1)) and if that is also None use a
fallback like frame.code.first_line (or the first location's line), then
construct PyTraceback::new with that line and set_traceback_typed as before.
Apply the same fallback logic to the other analogous block (the one around the
second occurrence you noted) so lasti/f_lineno restoration still runs and the
frame always gets a traceback entry.
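The suggested fallback chain can be modeled in isolation. This is a hypothetical standalone sketch, not the real frame.rs code: `Loc`, `first_line`, and `traceback_line` stand in for `frame.code.locations`, the code object's first line, and the lookup that feeds `PyTraceback::new`.

```rust
// Hypothetical model of the suggested fallback; the real lookup lives in
// crates/vm/src/frame.rs and feeds PyTraceback::new.
#[derive(Clone, Copy)]
struct Loc {
    line: u32,
}

fn traceback_line(locations: &[Loc], idx: usize, first_line: u32) -> u32 {
    locations
        .get(idx)
        // Fall back to the previous mapped location...
        .or_else(|| idx.checked_sub(1).and_then(|i| locations.get(i)))
        .map(|loc| loc.line)
        // ...and finally to the function's first line, so the frame
        // always contributes a traceback entry.
        .unwrap_or(first_line)
}

fn main() {
    let locs = [Loc { line: 10 }, Loc { line: 12 }];
    assert_eq!(traceback_line(&locs, 1, 1), 12); // exact hit
    assert_eq!(traceback_line(&locs, 2, 1), 12); // previous location
    assert_eq!(traceback_line(&locs, 9, 1), 1); // both miss -> first line
    assert_eq!(traceback_line(&[], 0, 7), 7); // empty table -> first line
}
```

With this shape the `if let Some(...)` guard disappears entirely, so the lasti restore below it always runs.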

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
extra_tests/snippets/stdlib_threading.py (1)

13-13: Use bounded join() to keep snippet tests fail-fast.

Lines 13 and 28 use unbounded joins. If an import/threading deadlock regresses, this can stall the snippets suite until global CI timeout.

Proposed fix
-    thread.join()
+    thread.join(timeout=5)
+    assert not thread.is_alive(), "thread did not finish in time"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@extra_tests/snippets/stdlib_threading.py` at line 13, The snippet uses
unbounded thread.join() calls (thread.join()) which can hang CI; replace both
occurrences with a bounded join like thread.join(timeout=5) and immediately
check thread.is_alive(), raising or failing the test if it remains alive (e.g.,
if thread.is_alive(): raise RuntimeError("thread did not terminate within
timeout")). Update both calls of thread.join() in stdlib_threading.py
accordingly.
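The bounded-join pattern this nitpick asks for can be wrapped in a small helper. A minimal sketch — `run_with_deadline` is an illustrative name, not part of the snippet under review:

```python
import threading


def run_with_deadline(target, timeout=5.0):
    # Bounded join: fail fast instead of stalling the suite until
    # the global CI timeout if the worker deadlocks.
    thread = threading.Thread(target=target)
    thread.start()
    thread.join(timeout=timeout)
    if thread.is_alive():
        raise RuntimeError("thread did not finish in time")


done = []
run_with_deadline(lambda: done.append(True))
assert done == [True]
```

Note that `join(timeout=...)` returns silently on timeout, so the `is_alive()` check is what actually turns a hang into a failure.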
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@extra_tests/snippets/stdlib_threading.py`:
- Around line 7-15: The worker thread can raise during import/exec and
Thread.join() won't surface that exception, causing a misleading KeyError on the
assertion; modify the worker (function named worker used when creating
threading.Thread) to catch any Exception and store it into the shared outcome
dict (e.g., outcome["exc"] = sys.exc_info() or outcome["exc"] =
traceback.format_exc()) alongside outcome["name"], then after thread.join() in
the main flow check outcome for an exception and re-raise or fail with the
stored exception details before performing the assert on outcome["name"]; apply
the same pattern to the other similar block that uses threading.Thread/join
covering lines 21-29.

---

Nitpick comments:
In `@extra_tests/snippets/stdlib_threading.py`:
- Line 13: The snippet uses unbounded thread.join() calls (thread.join()) which
can hang CI; replace both occurrences with a bounded join like
thread.join(timeout=5) and immediately check thread.is_alive(), raising or
failing the test if it remains alive (e.g., if thread.is_alive(): raise
RuntimeError("thread did not terminate within timeout")). Update both calls of
thread.join() in stdlib_threading.py accordingly.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

Run ID: 417e6fe1-ae21-4620-87c5-231da02f6fa7

📥 Commits

Reviewing files that changed from the base of the PR and between 9f76666 and c56f9fc.

📒 Files selected for processing (3)
  • crates/codegen/src/ir.rs
  • crates/vm/src/stdlib/_thread.rs
  • extra_tests/snippets/stdlib_threading.py
✅ Files skipped from review due to trivial changes (1)
  • crates/codegen/src/ir.rs
🚧 Files skipped from review as they are similar to previous changes (1)
  • crates/vm/src/stdlib/_thread.rs

Comment on lines +7 to +15
def worker():
    module = __import__(module_name, fromlist=["*"])
    outcome["name"] = module.__name__

thread = threading.Thread(target=worker)
thread.start()
thread.join()

assert outcome["name"] == module_name

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

In Python threading, does Thread.join() propagate exceptions raised in the target function, and what is the recommended way to surface worker exceptions to the joining thread?

💡 Result:

No, Thread.join() does not propagate exceptions raised in the target function. Uncaught exceptions in a thread's run() method are handled by threading.excepthook, which by default prints to stderr; from the joining thread's perspective the thread terminates silently. The recommended way to surface worker exceptions to the joining thread is concurrent.futures.ThreadPoolExecutor: submitting a task returns a Future, and calling future.result() re-raises any exception from the worker in the main thread. For raw threading.Thread, common patterns include:

  • Subclass Thread to store exc_info in run() and re-raise in an overridden join().
  • Use a Queue to pass exceptions from the worker to the main thread.

Official docs confirm Thread.join() only waits for termination without raising (Python 3.14 threading docs), while concurrent.futures explicitly propagates via Future.result() (concurrent.futures docs).

Example with ThreadPoolExecutor:

    import concurrent.futures

    def worker():
        raise ValueError("Error")

    with concurrent.futures.ThreadPoolExecutor() as executor:
        future = executor.submit(worker)
        try:
            future.result()
        except ValueError as e:
            print(e)  # propagates

Custom Thread subclass example:

    import threading
    import sys

    class ExcThread(threading.Thread):
        def run(self):
            self.exc = None
            try:
                super().run()
            except Exception:
                self.exc = sys.exc_info()

        def join(self, timeout=None):
            super().join(timeout)
            if self.exc:
                raise self.exc[1].with_traceback(self.exc[2])


Worker exceptions are silently swallowed, masking failures as KeyError at assertion time.

Thread.join() does not propagate exceptions raised in the target function. When the worker thread raises during import or exec, the main thread sees only KeyError at line 15 or 29, hiding the real failure and making regressions harder to diagnose.

Proposed fix
 def import_in_thread(module_name):
     outcome = {}
+    error = {}

     def worker():
-        module = __import__(module_name, fromlist=["*"])
-        outcome["name"] = module.__name__
+        try:
+            module = __import__(module_name, fromlist=["*"])
+            outcome["name"] = module.__name__
+        except BaseException as exc:
+            error["exc"] = exc

     thread = threading.Thread(target=worker)
     thread.start()
     thread.join()
+    if "exc" in error:
+        raise error["exc"]

     assert outcome["name"] == module_name


 def run_exec(code):
     result = {}
+    error = {}

     def worker():
-        scope = {"__builtins__": __builtins__}
-        exec(code, scope, scope)  # noqa: S102 - intentional threaded exec regression test
-        result["scope"] = scope
+        try:
+            scope = {"__builtins__": __builtins__}
+            exec(code, scope, scope)  # noqa: S102 - intentional threaded exec regression test
+            result["scope"] = scope
+        except BaseException as exc:
+            error["exc"] = exc

     thread = threading.Thread(target=worker)
     thread.start()
     thread.join()
+    if "exc" in error:
+        raise error["exc"]
     return result["scope"]

Also applies to: lines 21-29
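The error-dict pattern in the proposed fix can be exercised on its own. A minimal sketch — `run_in_thread` and `boom` are illustrative names, not part of the snippet file:

```python
import threading


def run_in_thread(target):
    # Capture any exception from the worker and re-raise it in the
    # caller, since Thread.join() alone would swallow it.
    error = {}

    def wrapper():
        try:
            target()
        except BaseException as exc:
            error["exc"] = exc

    thread = threading.Thread(target=wrapper)
    thread.start()
    thread.join()
    if "exc" in error:
        raise error["exc"]


def boom():
    raise ValueError("worker failed")


try:
    run_in_thread(boom)
except ValueError as exc:
    caught = str(exc)

assert caught == "worker failed"
```

Catching BaseException (rather than Exception) matters here: it also surfaces KeyboardInterrupt and SystemExit raised inside the worker.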

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@extra_tests/snippets/stdlib_threading.py` around lines 7 - 15, The worker
thread can raise during import/exec and Thread.join() won't surface that
exception, causing a misleading KeyError on the assertion; modify the worker
(function named worker used when creating threading.Thread) to catch any
Exception and store it into the shared outcome dict (e.g., outcome["exc"] =
sys.exc_info() or outcome["exc"] = traceback.format_exc()) alongside
outcome["name"], then after thread.join() in the main flow check outcome for an
exception and re-raise or fail with the stored exception details before
performing the assert on outcome["name"]; apply the same pattern to the other
similar block that uses threading.Thread/join covering lines 21-29.

