[IR Container] Phase 2.5 Copy-Move Semantics by mdavis36 · Pull Request #5964 · NVIDIA/Fuser

mdavis36 · 2026-02-12T23:31:36Z

Summary

Implement shared-container-aware copy, move, and swap operations, plus per-Fusion name counters that ensure cloned Vals get matching names. This PR combines the originally planned Tasks 3 and 4 — per-Fusion name counters were required to fix CI failures from the copy implementation (553 failures from duplicate TV names when name counter synchronization was missing).

Changes

Copy semantics:

Copy constructor & assignment op: Share container pointer via shared_ptr, register with container, delegate to Fusion::copy
Fusion::copy: Clear destination, create IrCloner targeting dest, clone source's deterministic_vals into shared container, clone Fusion-level state (inputs, outputs, axioms, metadata)

Move semantics:

Move constructor: Create empty Fusion, swap with source
Move assignment: Clear, swap

Swap:

Ownership-filtered pointer swap handling three distinct cases:
1. Two Fusions with different containers
2. Two Fusions sharing the same container
3. Swap with third-party Fusions sharing a container

Per-Fusion name counters:

val_type_name_map_ and expr_name_counter_ added as Fusion members
getValName(ValType) and getExprName() methods on Fusion
Counter lifecycle: sync in copy, swap in swap, reset in clear

Copy Semantics in Detail

BEFORE:
  Fusion A ──→ shared_ptr<Container C> ──→ {val_0(A), val_1(A), expr_0(A)}
  Container C: sharing_fusions_ = {A}

COPY: Fusion B(A)   // copy constructor

AFTER:
  Fusion A ─┐
             ├──→ shared_ptr<Container C> ──→ {val_0(A), val_1(A), expr_0(A),
  Fusion B ─┘                                  val_0'(B), val_1'(B), expr_0'(B)}
  Container C: sharing_fusions_ = {A, B}

  // B's clones have matching names: val_0'->name() == val_0->name()
  // IR graphs are independent: modifying B's clone doesn't affect A

The copy constructor shares the container (increments shared_ptr refcount), then clones A's nodes into the same shared storage. Per-Fusion tracking ensures each Fusion's accessors still return only their own nodes.
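
The copy behaviour described above can be modelled in a few lines. This is a minimal sketch, not nvFuser code: `ToyVal`, `ToyContainer`, `ToyFusion`, and all their members are hypothetical stand-ins, and the per-owner index is assumed to be a plain map keyed by the owning object's address.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <unordered_map>
#include <vector>

// Toy stand-ins for illustration only; none of these are nvFuser types.
struct ToyVal {
  const void* owner;  // the ToyFusion that owns this node
  int name;           // numeric name, e.g. 0 for "T0"
};

struct ToyContainer {
  // Physical storage shared by every ToyFusion pointing at this container.
  std::vector<std::unique_ptr<ToyVal>> storage;
  // Ownership index: which nodes belong to which ToyFusion.
  std::unordered_map<const void*, std::vector<ToyVal*>> per_owner;
};

class ToyFusion {
 public:
  ToyFusion() : container_(std::make_shared<ToyContainer>()) {}

  // Copy: share the container, clone the source's nodes into it under this
  // object's key while preserving names, then sync the name counter.
  ToyFusion(const ToyFusion& other) : container_(other.container_) {
    for (const ToyVal* v : other.vals()) {
      add(v->name);
    }
    next_name_ = other.next_name_;
  }
  ToyFusion& operator=(const ToyFusion&) = delete;

  ToyVal* newVal() { return add(next_name_++); }

  // Returns only the nodes owned by this object, in creation order.
  std::vector<ToyVal*> vals() const {
    auto it = container_->per_owner.find(this);
    if (it == container_->per_owner.end()) {
      return {};
    }
    return it->second;
  }

  std::size_t containerSize() const { return container_->storage.size(); }
  long useCount() const { return container_.use_count(); }

 private:
  ToyVal* add(int name) {
    container_->storage.push_back(std::make_unique<ToyVal>(ToyVal{this, name}));
    ToyVal* v = container_->storage.back().get();
    container_->per_owner[this].push_back(v);
    return v;
  }

  std::shared_ptr<ToyContainer> container_;
  int next_name_ = 0;  // per-object counter, analogous to a per-Fusion counter
};
```

After `ToyFusion b(a)`, both objects report the same names through `vals()`, the shared storage holds both sets of nodes, and a new node created through `b` continues from the source's counter rather than from the container's total size.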

Swap: Three Cases

Case 1: Different containers
  BEFORE:  A ──→ C1 ──→ {val_0(A)}     B ──→ C2 ──→ {val_1(B)}
  AFTER:   A ──→ C2 ──→ {val_1(A)}     B ──→ C1 ──→ {val_0(B)}
  Statement pointers updated: val_0→B, val_1→A

Case 2: Same container
  BEFORE:  A ─┐                         B ─┐
              ├──→ C ──→ {val_0(A), val_1(B)}
  AFTER:   A ─┐                         B ─┐
              ├──→ C ──→ {val_0(B), val_1(A)}
  Container pointer swap is a no-op; ownership flips.

Case 3: Third-party sharing
  BEFORE:  A ─┐
              ├──→ C1 ──→ {val_0(A), val_2(X)}     B ──→ C2 ──→ {val_1(B)}
           X ─┘
  AFTER:   A ──→ C2 ──→ {val_1(A)}
           B ─┐
              ├──→ C1 ──→ {val_0(B), val_2(X)}
           X ─┘
  Critical: X's statements are NEVER modified.
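
The three cases can be exercised with a small model. This is a sketch under simplifying assumptions: `ToyStmt`, `ToyContainer`, `ToyFusion`, `ownedBy`, and `toySwap` are hypothetical names, and the tracking update is done by erasing and re-inserting keys rather than by the `transferFusion` / `transferStatementOwnership` calls the PR uses.

```cpp
#include <cassert>
#include <memory>
#include <unordered_map>
#include <utility>
#include <vector>

struct ToyFusion;

struct ToyStmt {
  ToyFusion* owner;  // back-pointer to the owning ToyFusion
};

struct ToyContainer {
  std::unordered_map<ToyFusion*, std::vector<ToyStmt*>> per_owner;
};

struct ToyFusion {
  std::shared_ptr<ToyContainer> container;
};

// Returns the statements tracked for `f`, or an empty list if there are none.
std::vector<ToyStmt*> ownedBy(const ToyContainer& c, ToyFusion* f) {
  auto it = c.per_owner.find(f);
  if (it == c.per_owner.end()) {
    return {};
  }
  return it->second;
}

void toySwap(ToyFusion& a, ToyFusion& b) {
  if (&a == &b) {
    return;
  }
  // 1. Snapshot ownership before anything changes.
  std::vector<ToyStmt*> a_owned = ownedBy(*a.container, &a);
  std::vector<ToyStmt*> b_owned = ownedBy(*b.container, &b);
  // 2. Drop the old tracking keys (erase is a no-op for absent keys).
  a.container->per_owner.erase(&a);
  b.container->per_owner.erase(&b);
  // 3. Swap container pointers; this changes nothing when they are shared.
  std::swap(a.container, b.container);
  // 4. Flip back-pointers only on the statements a and b owned.
  for (ToyStmt* s : a_owned) s->owner = &b;
  for (ToyStmt* s : b_owned) s->owner = &a;
  // 5. Re-register under the new owners; third-party keys are never touched.
  if (!a_owned.empty()) b.container->per_owner[&b] = std::move(a_owned);
  if (!b_owned.empty()) a.container->per_owner[&a] = std::move(b_owned);
}
```

Because only the snapshots taken in step 1 are modified, a third object sharing either container keeps its statements, its back-pointers, and its tracking entry unchanged in all three cases.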

Why Name Counters Were Merged Into This PR

The initial implementation of Fusion::copy replaced the old IrContainer::copy with direct IrCloner-based cloning but dropped name counter synchronization. Without per-Fusion counters, cloned Vals in a shared container received names starting past the source's last name (e.g., T10–T19 instead of T0–T9), breaking alias_memory.cpp (duplicate tv->name() assertions) and cascading into 553 CI failures across codegen, validation, and numerical checks.

The fix — per-Fusion name counters as Fusion members — is architecturally cleaner than the originally planned IrContainer-level maps, avoids indirection, and aligns with the per-Fusion state model established in earlier tasks.

Relationship to Phase 2

Copy/move/swap are the operations that make shared containers usable. Without them, the shared_ptr and tracking infrastructure from PRs 1–2 are inert. This PR enables the core Phase 2 scenario:

SegmentedFusion::makeFusion (Phase 2 — separate containers):
  auto fusion_segment = make_unique<Fusion>();     // New container
  Fusion::copy(completeFusion(), fusion_segment);  // Clone into separate container

SegmentedFusion::makeFusion (Phase 3 — shared containers):
  auto fusion_segment = make_unique<Fusion>(*completeFusion());  // Copy ctor → shared!
  // Scalars reused, non-scalars cloned into shared container

Phase 2 establishes the copy/move/swap mechanics. Phase 3 simply changes makeFusion from default-ctor + Fusion::copy to copy-ctor (shared container), and the infrastructure from this PR handles everything correctly.

Per-Fusion name counters are critical for cross-clone name correspondence required by GreedyParams::at(tv->name()) and normalization_utils — both of which look up Vals by name as a map key across clone boundaries.

CI Risk

Medium. Copy/move/swap are well-defined operations with clear semantics. The 553-failure CI regression from missing name counters was identified and fixed before merge.

mdavis36 · 2026-02-12T23:31:46Z

!test

github-actions · 2026-02-12T23:32:29Z

Review updated until commit d145d9e

Description

Implement shared-container-aware copy/move/swap operations where copy constructor shares source's container pointer instead of creating new one
Add per-Fusion name counters (val_type_name_map_, expr_name_counter_) to fix duplicate TV names after copy when source names are non-sequential
Rewrite Fusion::swap to use pointer-based swap with ownership tracking for same-container, different-container, and third-party cases
Update registerVal/registerExpr to use Fusion-level name counters instead of container-level ones

Changes walkthrough

Relevant files

Enhancement

csrc/fusion.cpp: Implement copy-move-swap with shared containers (+121/-51)

Copy constructor now shares source's container pointer via `shared_ptr` and registers with container
`Fusion::copy` clones from `deterministic_vals()` directly instead of delegating to `IrContainer::copy`
Sync name counters from source to dest after cloning in `Fusion::copy`
Rewrite `Fusion::swap` to collect owned vals/exprs before swap, handle same-container vs different-container cases, swap container pointers and all Fusion-level members
Move constructor creates empty Fusion then swaps; move assignment clears then swaps
Add self-assignment guards in copy/move assignment operators
Remove `noexcept` from swap and move operations (can allocate vectors)
Clear name counters in `Fusion::clear()`

csrc/fusion.h: Add per-Fusion name counter members and methods (+18/-3)

Add `val_type_name_map_` (unordered_map) and `expr_name_counter_` as Fusion members
Add `getValName(ValType)` and `getExprName()` methods to Fusion
Remove `noexcept` from move constructor and move assignment declarations
Update `swap` declaration to remove `noexcept`

PR Reviewer Guide

Here are some key observations to aid the review process:

🧪 PR contains tests

⚡ Recommended focus areas for review

Complex swap logic

The Fusion::swap function handles three distinct cases: different containers, same container, and third-party sharing a container. While the logic appears sound, the complexity increases the risk of edge case bugs. The per-Fusion tracking key swapping at lines 185-191 and ownership transfer at lines 189-190 are particularly intricate. Consider adding more comprehensive unit tests for swap operations.

void Fusion::swap(Fusion& a, Fusion& b) {
  FUSER_PERF_SCOPE("Fusion swap");

  if (&a == &b) {
    return;
  }

  NVF_ERROR(
      a.ir_container_ != nullptr, "Fusion::swap: a has null ir_container_");
  NVF_ERROR(
      b.ir_container_ != nullptr, "Fusion::swap: b has null ir_container_");

  // Collect statements owned by each Fusion BEFORE swap so we can update
  // Statement::ir_container_ pointers afterward.
  std::vector<Val*> a_owned_vals, b_owned_vals;
  std::vector<Expr*> a_owned_exprs, b_owned_exprs;

  const auto& av = a.ir_container_->valsOwnedBy(&a);
  const auto& ae = a.ir_container_->exprsOwnedBy(&a);
  a_owned_vals.assign(av.begin(), av.end());
  a_owned_exprs.assign(ae.begin(), ae.end());

  const auto& bv = b.ir_container_->valsOwnedBy(&b);
  const auto& be = b.ir_container_->exprsOwnedBy(&b);
  b_owned_vals.assign(bv.begin(), bv.end());
  b_owned_exprs.assign(be.begin(), be.end());

  // Transfer Fusion registrations between containers before pointer swap.
  // After swap, a will own b's container and b will own a's container.
  if (a.ir_container_.get() != b.ir_container_.get()) {
    a.ir_container_->transferFusion(&a, &b);
    b.ir_container_->transferFusion(&b, &a);
  }

  // Swap container pointers
  std::swap(a.ir_container_, b.ir_container_);

  // Swap all Fusion-level members
  std::swap(a.inputs_, b.inputs_);
  std::swap(a.outputs_, b.outputs_);
  std::swap(a.io_alias_, b.io_alias_);
  std::swap(a.all_tv_uses_valid_, b.all_tv_uses_valid_);
  std::swap(a.is_during_update_uses_, b.is_during_update_uses_);
  std::swap(a.managed_data_, b.managed_data_);
  std::swap(a.managed_named_data_, b.managed_named_data_);
  std::swap(a.expected_dynamic_smem_bytes_, b.expected_dynamic_smem_bytes_);
  std::swap(a.all_tvs_ptr_, b.all_tvs_ptr_);
  std::swap(a.zero_val_, b.zero_val_);
  std::swap(a.one_val_, b.one_val_);
  std::swap(a.true_val_, b.true_val_);
  std::swap(a.false_val_, b.false_val_);
  std::swap(a.magic_zero_val_, b.magic_zero_val_);
  std::swap(a.axioms_, b.axioms_);
  std::swap(a.metadata_, b.metadata_);
  std::swap(a.val_type_name_map_, b.val_type_name_map_);
  std::swap(a.expr_name_counter_, b.expr_name_counter_);

  // Update Statement::ir_container_ pointers: a's old statements now belong
  // to b, and b's old statements now belong to a
  for (auto* val : a_owned_vals) {
    val->ir_container_ = &b;
  }
  for (auto* expr : a_owned_exprs) {
    expr->ir_container_ = &b;
  }
  for (auto* val : b_owned_vals) {
    val->ir_container_ = &a;
  }
  for (auto* expr : b_owned_exprs) {
    expr->ir_container_ = &a;
  }

  // Update per-Fusion tracking keys in containers. At this point, both
  // a and b are guaranteed to have non-null ir_container_ (verified above).
  if (a.ir_container_.get() == b.ir_container_.get()) {
    // Same container: directly swap per-Fusion tracking entries
    auto* c = a.ir_container_.get();
    std::swap(c->per_fusion_vals_[&a], c->per_fusion_vals_[&b]);
    std::swap(c->per_fusion_exprs_[&a], c->per_fusion_exprs_[&b]);
  } else {
    // Different containers: rename tracking keys to match new owners
    a.ir_container_->transferStatementOwnership(&b, &a);
    b.ir_container_->transferStatementOwnership(&a, &b);
  }
}

Container sharing semantics

The copy constructor shares the source's container via ir_container_ = other.ir_container_. This is intentional per the PR goals (shared-container-aware copy), but consumers need to be aware that modifications to one Fusion can affect the shared container state. Ensure this behavior is well-documented for API users.

Fusion::Fusion(const Fusion& other) : ir_container_(other.ir_container_) {
  FUSER_PERF_SCOPE("Fusion copy");
  ir_container_->addFusion(this);
  Fusion::copy(&other, this);
}

Non-noexcept move operations

Move constructor (line 329) and move assignment (line 347) are deliberately not marked noexcept; the code comments give the justification. This could hurt performance when Fusions are stored in standard library containers, which fall back to copying when a move operation may throw. Verify that this trade-off is acceptable for the expected use cases.

// Not marked noexcept: Fusion::swap allocates local std::vectors to collect
// statement ownership before the swap, which can throw. Since Fusions are not
// expected to be moved into containers, the performance trade-off is
// acceptable.
// NOLINTNEXTLINE(cppcoreguidelines-noexcept-move-operations)
Fusion::Fusion(Fusion&& other) : Fusion() {
  FUSER_PERF_SCOPE("Fusion move");
  swap(*this, other);
}

// Copy Assignment -- shares the source's container
Fusion& Fusion::operator=(const Fusion& other) {
  FUSER_PERF_SCOPE("Fusion copy assign");
  if (this != &other) {
    Fusion copy(other);
    clear();
    swap(*this, copy);
  }
  return *this;
}

// Not marked noexcept: See move constructor above.
// NOLINTNEXTLINE(cppcoreguidelines-noexcept-move-operations)
Fusion& Fusion::operator=(Fusion&& other) {
  FUSER_PERF_SCOPE("Fusion move assign");
  if (this != &other) {
    clear();
    swap(*this, other);
  }
  return *this;
}

mdavis36 · 2026-02-18T00:42:35Z

!test

mdavis36 · 2026-02-18T06:37:48Z

!test

greptile-apps · 2026-02-18T06:43:20Z

Greptile Summary

This PR implements shared-container-aware copy, move, and swap semantics for Fusion, along with per-Fusion name counters (val_type_name_map_ / expr_name_counter_) that ensure cloned Vals receive matching names regardless of container ownership. The three-case swap logic (different containers, same container, third-party sharing) is well-designed and correctly handles sharing_fusions_ bookkeeping via transferFusion and transferStatementOwnership.

The one remaining concern: the copy assignment operator calls clear() explicitly before swap, which violates the standard copy-and-swap idiom's exception guarantee. If swap throws (due to std::bad_alloc in its vector allocations), *this is left permanently cleared rather than preserving its original state. Removing the explicit clear() call restores proper exception safety.

Confidence Score: 3/5

Safe to merge with awareness of the copy-assignment exception safety gap; all other semantics are correctly implemented.
The three-case swap logic, counter synchronization, and shared-container registration are all correctly implemented and well-tested (553 CI failures were caught and fixed before merge). The score is reduced from 5 because the copy assignment operator's explicit clear() before swap gives up the strong exception guarantee that copy-and-swap normally provides: if swap throws (due to std::bad_alloc in its local vector allocations), *this is left permanently cleared rather than preserving its original state. This is a real, if low-probability, correctness risk for callers that catch exceptions.
csrc/fusion.cpp — specifically operator=(const Fusion&) at line 335, where the explicit clear() before swap should be removed to restore the standard idiom's exception safety.

Sequence Diagram

sequenceDiagram
    participant Caller
    participant FusionB as Fusion B (copy ctor)
    participant Container as shared IrContainer C
    participant IrCloner
    participant FusionA as Fusion A (source)

    Caller->>FusionB: Fusion B(A)  [copy constructor]
    FusionB->>Container: share A.ir_container_ (shared_ptr copy)
    FusionB->>Container: addFusion(&B)
    FusionB->>FusionB: Fusion::copy(&A, &B)
    FusionB->>FusionB: clear() → removeStatementsOwnedBy(&B) [no-op]
    FusionB->>IrCloner: IrCloner ir_cloner(B)
    loop clone vals in insertion order
        IrCloner->>Container: clone(val_i) → registerVal(&B) → getValName(vtype)
        IrCloner->>IrCloner: setName(src->name()) overrides counter
    end
    loop wire definitions/uses
        IrCloner->>Container: clone(expr_j) → registerExpr(&B) → getExprName()
        IrCloner->>FusionB: setDefinition / setUses
    end
    FusionB->>FusionB: sync val_type_name_map_ = A.val_type_name_map_
    FusionB->>FusionB: sync expr_name_counter_ = A.expr_name_counter_
    FusionB->>FusionB: remap inputs_, outputs_, io_alias_, axioms_, metadata_

    note over Container: C now holds A's vals (keyed &A) AND B's clones (keyed &B)
    note over FusionB: B.val_type_name_map_ matches A → new TVs start at max(name)+1

Last reviewed commit: c8ffe1d

greptile-apps

4 files reviewed, 3 comments

csrc/fusion.cpp

csrc/fusion.h

mdavis36 · 2026-02-18T16:02:09Z

!test

greptile-apps · 2026-03-03T01:19:53Z

Additional Comments (1)

csrc/fusion.cpp, line 221
expr_name_counter_ sync placed before expr cloning

The counter sync at lines 220–221 sets to->expr_name_counter_ = from->expr_name_counter_ before any exprs have been cloned. Expr cloning happens in the immediately following def/uses loop (lines 224–227): each ir_cloner.clone(val->definition_) call creates a cloned Expr and triggers registerExpr, which increments to->expr_name_counter_. As a result, after Fusion::copy completes, to->expr_name_counter_ ends up at from->expr_name_counter_ + N (where N is the number of exprs in from), rather than the intended from->expr_name_counter_.

Compare with val counters: the val counter sync is correctly placed after the val-cloning loop so that registerVal calls during cloning don't pollute the final counter state. The same logic needs to apply to expr_name_counter_.

While this doesn't cause immediate name collisions (the overcounted range is unused by any actual expr), it creates unnecessary name-space gaps and violates the invariant that a cloned Fusion's counter should match the source's counter — which is the premise relied on by GreedyParams::at(tv->name()) and similar consumers.

  // Wire up definitions and uses on cloned vals
  for (auto val : from->vals()) {
    ir_cloner.clone(val)->setDefinition(ir_cloner.clone(val->definition_));
    ir_cloner.clone(val)->setUses(ir_cloner.clone(val->uses_));
  }

  // Sync per-Fusion name counters from source to dest.
  // This must happen AFTER all cloning (both vals and exprs) so that the
  // temporary sequential names assigned by registerVal/registerExpr during
  // cloning do not inflate the counters past the source's values.
  to->val_type_name_map_ = from->val_type_name_map_;
  to->expr_name_counter_ = from->expr_name_counter_;
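
The effect of the sync placement can be reproduced with a toy model. This is a sketch only: `ToyFusion`, `registerExpr`, and `toyCopy` are hypothetical simplifications, and the `sync_first` flag exists purely to compare the two orderings side by side.

```cpp
#include <cassert>
#include <vector>

struct ToyFusion {
  std::vector<int> expr_names;  // names of registered exprs, in order
  int expr_name_counter = 0;

  // Registration hands out the next sequential name and bumps the counter,
  // standing in for registerExpr -> getExprName().
  int registerExpr() {
    expr_names.push_back(expr_name_counter);
    return expr_name_counter++;
  }
};

// Clones every expr of `from` into `to`, overriding each temporary name with
// the source name. `sync_first` selects where the counter sync happens.
void toyCopy(const ToyFusion& from, ToyFusion& to, bool sync_first) {
  to.expr_names.clear();
  to.expr_name_counter = 0;
  if (sync_first) {
    to.expr_name_counter = from.expr_name_counter;  // too early
  }
  for (int src_name : from.expr_names) {
    to.registerExpr();                // bumps to.expr_name_counter
    to.expr_names.back() = src_name;  // analogue of setName(src->name())
  }
  if (!sync_first) {
    to.expr_name_counter = from.expr_name_counter;  // after all cloning
  }
}
```

With three source exprs, syncing first leaves the destination counter at 6 (3 + 3), while syncing last leaves it at 3. The cloned names are identical either way, which matches the observation that the early sync causes gaps rather than collisions.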

mdavis36 · 2026-03-03T02:35:17Z

!test

greptile-apps · 2026-03-03T03:00:04Z

Additional Comments (2)

csrc/fusion.cpp, line 212
Non-deterministic expr clone ordering in second loop

The second loop uses from->vals(), an unordered_set, to drive lazy expr cloning via ir_cloner.clone(val->definition_). Because unordered_set iteration order is unspecified and can differ across runs, the order in which exprs are first encountered (and thus registered into exprs_up_) is unpredictable.

The old IrContainer::copy explicitly cloned exprs in deterministic_exprs() order:

// Copy expressions in deterministic order
for (auto expr : from->deterministic_exprs()) {
    to->exprs_.insert(ir_cloner.clone(expr));
}

With the new code, the destination fusion's deterministic_exprs() may return exprs in a different insertion-order than from->deterministic_exprs(), since exprs_up_ insertion order is driven by whichever val happens to trigger each expr's first clone. Any downstream consumer of deterministic_exprs() that expects a stable ordering (e.g. serialisation, printing, diff-based testing) may break non-deterministically.

The first loop already uses from->deterministic_vals() for reproducibility — the second loop should match:

  for (auto val : from->deterministic_vals()) {
    ir_cloner.clone(val)->setDefinition(ir_cloner.clone(val->definition_));
    ir_cloner.clone(val)->setUses(ir_cloner.clone(val->uses_));
  }
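
The distinction between a membership set and an insertion-ordered view can be sketched as follows. `ToyContainer` and its members are hypothetical; the assumption is that nodes live in a sequence container and the hash set is only used for lookups.

```cpp
#include <cassert>
#include <deque>
#include <unordered_set>
#include <vector>

class ToyContainer {
 public:
  int* add(int value) {
    storage_.push_back(value);  // deque::push_back keeps element addresses stable
    int* p = &storage_.back();
    members_.insert(p);
    return p;
  }

  // Fast membership test. Iterating members_ directly would yield an
  // unspecified order, so it is never exposed for traversal.
  bool contains(int* p) const { return members_.count(p) != 0; }

  // Insertion-ordered view, analogous in spirit to deterministic_vals().
  std::vector<int*> deterministic() {
    std::vector<int*> out;
    for (int& v : storage_) {
      out.push_back(&v);
    }
    return out;
  }

 private:
  std::deque<int> storage_;
  std::unordered_set<int*> members_;
};
```

Any loop whose side effects depend on visit order, such as the lazy expr cloning discussed above, should be driven by the ordered view so that repeated runs register clones in the same sequence.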

csrc/fusion.cpp, line 324
noexcept move constructor delegates to non-noexcept swap

The move constructor is declared noexcept (both in fusion.h line 183 and in the definition here), but Fusion::swap was changed in this PR from noexcept to a regular (potentially-throwing) function. swap now performs multiple allocating operations (vector construction, unordered_map::operator[], unordered_set::insert) that can throw std::bad_alloc.

Calling a potentially-throwing function from a noexcept context means any exception will immediately invoke std::terminate rather than propagate to the caller. The same issue applies to operator=(Fusion&&) on line 334 which is also noexcept and calls swap.

The previous discussion noted that swap was problematic when it was noexcept — but now that noexcept has been removed from swap, the move operations need to be updated consistently:

// In fusion.h:
Fusion(Fusion&& other);                        // remove noexcept
Fusion& operator=(Fusion&& other);             // remove noexcept

// In fusion.cpp:
Fusion::Fusion(Fusion&& other) : Fusion() {   // remove noexcept
  FUSER_PERF_SCOPE("Fusion move");
  swap(*this, other);
}

Fusion& Fusion::operator=(Fusion&& other) {   // remove noexcept
  ...
}

Note: removing noexcept from move operations does affect STL container optimisations (e.g. when std::vector reallocates, it copies elements instead of moving them if the move constructor is not noexcept and the type is copy-constructible). That may be an acceptable trade-off given Fusion objects are not typically stored in vectors, but worth being explicit about.
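
The container trade-off can be observed directly. The sketch below uses two hypothetical types, `ThrowingMove` and `NothrowMove`, that differ only in whether the move constructor is `noexcept`, and counts which constructor `std::vector` picks during reallocation. It assumes the standard library relocates elements with `std::move_if_noexcept`, which is how the common implementations provide the strong guarantee for `reserve`.

```cpp
#include <cassert>
#include <type_traits>
#include <vector>

struct Counts {
  int copies = 0;
  int moves = 0;
};

// Move constructor may throw (no noexcept), like the move operations in this PR.
struct ThrowingMove {
  Counts* c;
  explicit ThrowingMove(Counts* counts) : c(counts) {}
  ThrowingMove(const ThrowingMove& o) : c(o.c) { ++c->copies; }
  ThrowingMove(ThrowingMove&& o) : c(o.c) { ++c->moves; }
};

// Identical except that the move constructor is noexcept.
struct NothrowMove {
  Counts* c;
  explicit NothrowMove(Counts* counts) : c(counts) {}
  NothrowMove(const NothrowMove& o) : c(o.c) { ++c->copies; }
  NothrowMove(NothrowMove&& o) noexcept : c(o.c) { ++c->moves; }
};

static_assert(!std::is_nothrow_move_constructible<ThrowingMove>::value,
              "move may throw");
static_assert(std::is_nothrow_move_constructible<NothrowMove>::value,
              "move is noexcept");

// Stores one element, then forces a single reallocation and reports which
// constructor the vector used to relocate that element.
template <typename T>
Counts relocateOnce() {
  Counts counts;
  std::vector<T> v;
  v.reserve(1);
  v.emplace_back(&counts);      // constructed in place: no copy, no move
  v.reserve(v.capacity() + 1);  // capacity must grow, so the element relocates
  return counts;
}
```

For a type that is expensive to copy, the first case is the cost the reviewer is pointing at; it only matters if such objects are actually stored by value in a growing container.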

greptile-apps · 2026-03-03T03:19:18Z

Additional Comments (3)

csrc/fusion.cpp, line 333
Copy assignment silently inherits shared container

The copy assignment operator now produces a destination Fusion that shares other's IrContainer, whereas the old code left the destination with its own independent container. This is an indirect consequence of the copy constructor change: Fusion copy(other) now calls the new copy constructor (which does ir_container_(other.ir_container_)), and after the swap, *this ends up owning that shared container.

Consider the following scenario:

Fusion a;
// populate a...
Fusion b;
b = a;
// b now shares a's IrContainer — NOT isolated!
a.clear();  // also removes statements from the container b depends on

Before this PR, b = a gave b a fully independent container. After this PR, b shares a's container. The PR description explicitly documents this only for the copy constructor; the copy assignment operator is not mentioned. If this is intentional, a comment documenting the shared-container semantics of operator= would prevent future surprises.

csrc/fusion.cpp, line 184
Same-container swap inserts spurious empty entries via operator[]

In the same-container path, operator[] on per_fusion_vals_ and per_fusion_exprs_ default-constructs an empty unordered_set when a key doesn't exist (e.g., one of the Fusions has never registered any vals):

std::swap(c->per_fusion_vals_[&a], c->per_fusion_vals_[&b]);
std::swap(c->per_fusion_exprs_[&a], c->per_fusion_exprs_[&b]);

If &a or &b had no registered vals/exprs, a new empty-set entry is inserted for that key. All current consumers (valsOwnedBy, deterministicValsOwnedBy, etc.) handle empty sets correctly, so there's no observable correctness issue. However, the spurious entries accumulate over time and could mislead future code that inspects the keys of per_fusion_vals_ to determine which Fusions are registered.

Consider using find first and only inserting/swapping entries when they actually exist, mirroring the defensive style in transferStatementOwnership.

csrc/fusion.cpp, line 213
Name override in IrBuilder::clone is skipped when src_container == dest_container (raw pointer comparison)

In IrBuilder::clone (fusion.h line 731):

if (src_container != dest_container) {
    dest_stmt->setName(IrBuilderPasskey(dest_container), src_stmt->name());
}

src_container is src_stmt->container() — the Fusion* that owns the source val — and dest_container is ir_cloner->container() — the destination Fusion*. In Fusion::copy, the IrCloner is constructed as IrCloner ir_cloner(to), so dest_container = to. Because from != to as Fusion objects (even when they share the same IrContainer), the condition is true and setName is correctly invoked, preserving name matching.

However, this invariant is fragile: it relies on the fact that Statement::ir_container_ stores a Fusion* (not the raw IrContainer*). If any path sets val->ir_container_ to the shared IrContainer* instead of the owning Fusion*, names would silently stop matching without any assertion failure.

A targeted NVF_CHECK or static assertion confirming that src_stmt->container() returns the owning Fusion (not the shared IrContainer) would make this assumption explicit and catch regressions.

mdavis36 · 2026-03-03T03:41:31Z

!test

greptile-apps · 2026-03-03T19:34:32Z

Additional Comments (1)

csrc/fusion.cpp, line 334
Missing self-assignment guard in copy assignment operator

The copy assignment operator is missing the self-assignment check that the move assignment operator already has (if (this != &other) at line 338). Without it, f = f is destructive:

1. Fusion copy(f): creates a copy sharing f's container and clones all vals into the same shared container under a temporary Fusion key
2. clear(): deletes f's original vals from the shared container (the unique_ptr in vals_up_ is released)
3. swap(*this, copy): f receives the clones; the original pointers are gone

Any external Val* / Expr* raw pointers held by callers (e.g., cached scheduler state, Statement::ir_container_ back-pointers) now point to freed memory. Adding the guard mirrors the protection already in the move assignment operator.

Fusion& Fusion::operator=(const Fusion& other) {
  FUSER_PERF_SCOPE("Fusion copy assign");
  if (this == &other) {
    return *this;
  }
  Fusion copy(other);
  clear();
  swap(*this, copy);
  return *this;
}

mdavis36 · 2026-03-03T20:18:06Z

!test

greptile-apps · 2026-03-03T20:26:53Z

Additional Comments (3)

csrc/fusion.cpp, line 184
Spurious empty-entry insertion via operator[] in same-container swap

std::unordered_map::operator[] default-inserts a value (an empty unordered_set) when the key is absent. In the same-container swap path, if one of the Fusions was previously clear()-ed its entry will have been removed from per_fusion_vals_ by removeStatementsOwnedBy, so the operator[] call creates a new, empty set for it. After the std::swap, the formerly-cleared Fusion ends up with an explicit empty entry instead of no entry at all.

While functionally harmless today, this inconsistency can confuse future callers of valsOwnedBy() that distinguish between "has an entry with an empty set" and "has no entry". Consider checking with find first, so that at least nothing is inserted when neither key is present:

auto& a_vals = c->per_fusion_vals_;
auto& a_exprs = c->per_fusion_exprs_;
auto it_av = a_vals.find(&a), it_bv = a_vals.find(&b);
auto it_ae = a_exprs.find(&a), it_be = a_exprs.find(&b);
if (it_av != a_vals.end() || it_bv != a_vals.end())
  std::swap(a_vals[&a], a_vals[&b]);
// same for exprs
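
A variant that never leaves an empty entry behind, even when only one of the two keys exists, might look like the sketch below. `swapEntries` is a hypothetical generic helper written for this illustration; it is not part of the PR and makes no assumption about the map beyond the standard associative-container interface.

```cpp
#include <cassert>
#include <unordered_map>
#include <unordered_set>
#include <utility>

// Swaps the values stored under keys `a` and `b` without creating an empty
// entry for a key that had none before the call.
template <typename Map>
void swapEntries(Map& m,
                 const typename Map::key_type& a,
                 const typename Map::key_type& b) {
  auto it_a = m.find(a);
  auto it_b = m.find(b);
  const bool has_a = it_a != m.end();
  const bool has_b = it_b != m.end();
  if (has_a && has_b) {
    std::swap(it_a->second, it_b->second);
  } else if (has_a) {
    // Only `a` has an entry: move it under `b` and drop the old key.
    auto moved = std::move(it_a->second);
    m.erase(it_a);
    m.emplace(b, std::move(moved));
  } else if (has_b) {
    // Only `b` has an entry: move it under `a` and drop the old key.
    auto moved = std::move(it_b->second);
    m.erase(it_b);
    m.emplace(a, std::move(moved));
  }
  // Neither key present: nothing to do, and nothing is inserted.
}
```

The one-sided branches re-key the entry instead of swapping with a default-constructed value, so the set of keys after the call always reflects which owners actually have statements.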

csrc/fusion.cpp, line 213
Fusion::copy no longer calls IrContainer::copy, leaving it as dead code

Fusion::copy previously delegated to IrContainer::copy (the old first line was auto ir_cloner = IrContainer::copy(from->ir_container(), to->ir_container(), to)). This PR replaces that with an inline clone loop, which is correct — but it leaves IrContainer::copy (defined in container.cpp lines 88–114) with zero callers. Since IrContainer's copy/move constructors/operators are explicitly deleted, IrContainer::copy is now unreachable protected dead code.

Similarly, IrContainer::swap (container.cpp lines 71–86) was previously called by the old Fusion::swap but the new Fusion::swap does not call it — leaving it as an additional dead method.

Most importantly, both dead methods still manipulate the container-level name counters (val_type_name_map_ and expr_name_counter_ on IrContainer). Since Fusion::registerVal and Fusion::registerExpr now call Fusion::getValName/getExprName (the per-Fusion counters introduced in this PR) instead of the container-level equivalents, the container-level counters are never incremented in the new flow. They are permanently empty/zero, making IrContainer::getValName() and IrContainer::getExprName() return incorrect values if ever called in the future.

Consider removing IrContainer::copy, IrContainer::swap, IrContainer::getValName, IrContainer::getExprName, and the fields val_type_name_map_/expr_name_counter_ from IrContainer as part of this PR to prevent future confusion and potential misuse.

csrc/fusion.cpp, line 335
Copy-assignment leaves this sharing other's container even when this had an independent container

The sequence:

Fusion copy(other);   // copy ctor → copy now shares other->ir_container_
clear();              // clears this's statements from its original container
swap(*this, copy);    // this ends up sharing other's container

After this operation *this permanently shares other's IrContainer. That is by design per the PR description ("Copy semantics: share container pointer via shared_ptr"), but it creates a subtle lifetime asymmetry: this's original container is now held only by the temporary copy. When copy destructs, it calls copy.~Fusion() → removeFusion(&copy) → the original container's refcount drops to zero and it is destroyed.

This is correct for the current ownership model, but it means that after a = b, a and b now share a container. A subsequent a.clear() will call removeStatementsOwnedBy(&a) on the shared container — and only a's statements are removed, not b's. That is the intended semantics, but it may surprise callers who expect copy-assignment to be self-contained (e.g., existing code that calls Fusion::copy then makes independent modifications to both fusions would now inadvertently share storage). A comment on the declaration in fusion.h (near the copy-assignment signature) calling out that the assignment shares the source's container would prevent confusion.

greptile-apps · 2026-03-04T23:17:15Z

Additional Comments (3)

csrc/ir/container.cpp, line 243
removeStatementsOwnedBy doesn't clean up use-def edges before destroying statements

When a Fusion is cleared in a shared container, the destroyed Exprs are not removed from their input Vals' uses_ lists before the unique pointers are released. If an Expr owned by Fusion A takes a Val owned by Fusion B as input, destroying Fusion A's Expr leaves B's Val::uses_ with a dangling pointer to the destroyed expression.

Compare with removeStatementsCreatedAfter, which explicitly cleans up uses before destruction:

for (Val* in : e->inputs()) {
  in->removeUse(e);
}

This is not a risk for today's independent-clone graphs (Phase 2 copy semantics keep IR graphs separate), but the PR description explicitly calls out Phase 3 scalar reuse where Vals in one Fusion ARE referenced by Exprs in another Fusion. At that point, clearing a Fusion via Fusion::clear() → removeStatementsOwnedBy will corrupt the surviving Fusion's IR graph with dangling Expr* pointers in Val::uses_.

Consider at minimum adding a comment warning about this limitation, or adding edge cleanup before the erase_if loops:

// For each expr owned by fusion, remove it from its inputs' uses_ lists
// to avoid dangling pointers in other Fusions' Vals
if (exprs_it != per_fusion_exprs_.end()) {
  for (Expr* e : exprs_it->second) {
    for (Val* inp : e->inputs()) {
      inp->removeUse(e);
    }
  }
}
// ... then erase_if loop
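
The dangling-use hazard and the proposed cleanup can be shown on a toy graph. `ToyVal`, `ToyExpr`, `makeExpr`, and `destroyExprs` are hypothetical names for this sketch; the assumption is that a Val keeps raw pointers to the Exprs that consume it, as the comment above describes.

```cpp
#include <algorithm>
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

struct ToyExpr;

struct ToyVal {
  std::vector<ToyExpr*> uses;  // raw pointers to consuming exprs
  void removeUse(ToyExpr* e) {
    uses.erase(std::remove(uses.begin(), uses.end(), e), uses.end());
  }
};

struct ToyExpr {
  std::vector<ToyVal*> inputs;
};

// Creates an expr and registers it as a use on each of its inputs.
std::unique_ptr<ToyExpr> makeExpr(std::vector<ToyVal*> inputs) {
  auto e = std::make_unique<ToyExpr>();
  e->inputs = std::move(inputs);
  for (ToyVal* in : e->inputs) {
    in->uses.push_back(e.get());
  }
  return e;
}

// Detaches every expr from its inputs' use lists before destroying it, so
// that surviving vals never hold pointers to freed exprs.
void destroyExprs(std::vector<std::unique_ptr<ToyExpr>>& exprs) {
  for (auto& e : exprs) {
    for (ToyVal* in : e->inputs) {
      in->removeUse(e.get());
    }
  }
  exprs.clear();
}
```

If one owner's exprs are destroyed this way while another owner's expr still consumes the same val, the val's use list shrinks to the surviving expr only, which is the state the reviewer's suggested cleanup aims for.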

csrc/fusion.cpp, line 504
LIFO invariant is fragile in a shared-container setting

removeStatementsCreatedAfter pops from the back of the shared exprs_up_ and vals_up_ deques under the assumption that this Fusion's most recently created statements are always at the tail ("LIFO invariant"). In a shared container where two Fusions can interleave statement creation, this invariant can silently break.

Concrete scenario: StatementGuard is entered for FusionA (records count N), then FusionA creates expr e5, then FusionB creates expr e0 into the same shared deque. The global deque tail is now e0 (owned by FusionB). When the guard destructs and calls removeStatementsCreatedAfter:

c->exprsOwnedBy(this).size() == N+1 > N  → enters while loop
c->exprs_up_.back() == e0                → belongs to FusionB
NVF_ERROR fires                          → crash

With kPhase2DisableParallelCompile = true this is currently safe, but the assertion is structurally fragile. Any future work that allows two Fusions sharing a container to be active simultaneously (e.g., Phase 3's SegmentedFusion) will hit this. A more robust approach would iterate exprs_up_ from the back and skip (or separately track) statements belonging to other Fusions, rather than asserting on ownership of the global tail.
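
One possible shape for the skip-based approach is sketched below. It is an illustration under assumptions, not the PR's implementation: `ToyStmt`, `countOwned`, and `removeCreatedAfter` are hypothetical, owners are identified by an integer id, and the shared storage is modelled as a single `std::deque`.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

struct ToyStmt {
  int owner;  // id of the owning fusion
  int id;     // statement id, only used to tell statements apart
};

std::size_t countOwned(const std::deque<ToyStmt>& stmts, int owner) {
  std::size_t n = 0;
  for (const ToyStmt& s : stmts) {
    if (s.owner == owner) {
      ++n;
    }
  }
  return n;
}

// Removes the most recently created statements of `owner` until only `keep`
// of them remain. Statements of other owners are skipped, not asserted on.
void removeCreatedAfter(std::deque<ToyStmt>& stmts, int owner, std::size_t keep) {
  std::size_t owned = countOwned(stmts, owner);
  auto it = stmts.end();
  while (owned > keep && it != stmts.begin()) {
    --it;
    if (it->owner == owner) {
      // erase returns the iterator following the removed element; the next
      // --it then steps to the element that preceded it.
      it = stmts.erase(it);
      --owned;
    }
  }
}
```

In the interleaved scenario above, the tail statement belongs to the other owner and is left in place, while the guarded owner's newer statement just before it is removed. Erasing from the middle of a deque is linear, so this trades some speed for not depending on the tail invariant.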

csrc/runtime/fusion_kernel_runtime.cpp, line 30
Global compile-time constant silently disables parallel compilation for all fusions

kPhase2DisableParallelCompile = true is a compile-time constant, meaning parallel compilation is unconditionally disabled for every FusionKernelRuntime, regardless of whether the fusion actually uses shared containers. This is a hard-coded performance regression that affects all users of the runtime (not just Phase 2 shared-container paths) and cannot be toggled without a recompile.

A runtime-checkable condition would be preferable — for example, checking whether the Fusion's ir_container_ptr() has hasMultipleFusions() or a dedicated Phase2 flag:

if (num_groups == 1 ||
    fusion_->ir_container_ptr()->hasMultipleFusions() ||
    isOptionDisabled(DisableOption::ParallelCompile)) {

This would restore parallel compilation for non-shared-container fusions immediately, rather than waiting for the TODO to be addressed.
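As a self-contained illustration of the runtime predicate, the sketch below uses toy types. `Container`, `sharing_fusions`, and `mustCompileSerially` are assumptions made for this example; `hasMultipleFusions()` is the accessor name suggested above and is not confirmed to exist in the codebase.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_set>

struct Fusion {};  // hypothetical stand-in

struct Container {
  std::unordered_set<const Fusion*> sharing_fusions;

  // Assumed accessor: true when more than one Fusion shares this container.
  bool hasMultipleFusions() const { return sharing_fusions.size() > 1; }
};

// Serial compilation is chosen per runtime instead of via a global constant.
bool mustCompileSerially(
    std::size_t num_groups,
    const Container& container,
    bool parallel_compile_disabled) {
  return num_groups == 1 || container.hasMultipleFusions() ||
      parallel_compile_disabled;
}
```

With this shape, a Fusion that owns its container exclusively and has several segments still compiles in parallel, while shared-container Fusions fall back to the serial path.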

greptile-apps · 2026-03-04T23:35:05Z

Additional Comments (2)

csrc/fusion.cpp, line 201
Missing self-copy guard in Fusion::copy

Fusion::copy is a static public method. If a caller passes the same Fusion as both from and to (i.e. Fusion::copy(f, f)), line 201 calls to->clear(), which calls ir_container_->removeStatementsOwnedBy(to) and destroys all of from's vals. The subsequent iteration over from->deterministic_vals() then sees an empty container and produces a silently empty result — data loss without any error.

While the current internal call sites (copy constructor and copy assignment) never trigger this, the function is public and lacks a precondition assertion:

IrCloner Fusion::copy(const Fusion* from, Fusion* to) {
  NVF_ERROR(from != to, "Fusion::copy: self-copy is not allowed");
  to->clear();
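
The failure mode and the guard can be reproduced with a toy type. This is a sketch under assumptions: `ToyFusion` and `copyInto` are hypothetical names, and an exception stands in for `NVF_ERROR` so the behavior can be checked in isolation.

```cpp
#include <cassert>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for a Fusion that owns a list of vals.
struct ToyFusion {
  std::vector<int> vals;
  void clear() { vals.clear(); }
};

// Clear-then-clone copy with a self-copy precondition.
// Without the check, copyInto(&f, &f) would clear f and then "clone" nothing.
void copyInto(const ToyFusion* from, ToyFusion* to) {
  assert(from != nullptr && to != nullptr);
  if (from == to) {
    throw std::invalid_argument("copyInto: self-copy is not allowed");
  }
  to->clear();
  to->vals = from->vals;
}
```

The check runs before `clear()`, so a rejected self-copy leaves the Fusion untouched.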

csrc/fusion.cpp, line 343
Exception safety hole: clear() before potentially-throwing swap

The copy-assignment operator calls clear() on *this before calling swap(*this, copy). Since this PR correctly removed noexcept from swap (because swap can now throw std::bad_alloc through its vector allocations and map insertions), if swap throws after clear(), *this will be left in an empty, unusable state — neither the old content nor the new content.

The classic copy-and-swap idiom avoids this by not explicitly clearing before the swap: the old state is destroyed by the temporary's destructor after a successful swap. However, with the new shared-container semantics in this PR, directly using a by-value parameter for copy-assignment would create unintended container sharing on the way in.

One safe alternative is to check whether the copy completed before clearing:

Fusion& Fusion::operator=(const Fusion& other) {
  if (this != &other) {
    Fusion copy(other);   // can throw — *this unchanged
    // Only reach swap (which can throw) after copy succeeded.
    // If swap throws here, *this is empty; consider moving swap-on-success
    // into a noexcept helper or accepting this as a medium-risk invariant.
    clear();
    swap(*this, copy);
  }
  return *this;
}

At minimum this should be documented as "basic exception safety only" — if swap throws, *this is valid but empty.

Copy constructor now shares the source's container pointer instead of creating a new one. Fusion::copy clones directly from per-Fusion filtered vals rather than delegating to IrContainer::copy. Swap changed from content-based (IrContainer::swap) to pointer-based with per-Fusion ownership tracking for both same-container and different-container cases.

Move val/expr name counters from IrContainer to Fusion so each Fusion independently tracks name assignment. This fixes CI failures where Fusion::copy left the dest counter at N (number of cloned vals) instead of max(name)+1 when source names were non-sequential, causing newly created TVs to collide with existing names. The fix adds val_type_name_map_ and expr_name_counter_ to Fusion, and updates registerVal/registerExpr to use the Fusion-level counters. Fusion::copy syncs counters from source to dest after cloning. Fusion::swap exchanges counters. Fusion::clear resets them.
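
The difference between the two counter values can be seen with a small example. This is an illustrative sketch only; `nextNameFromCount`, `nextNameFromMax`, and `collides` are hypothetical helpers, not functions from the PR.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Buggy choice described above: next name = number of cloned vals.
unsigned nextNameFromCount(const std::vector<unsigned>& names) {
  return static_cast<unsigned>(names.size());
}

// Safe choice: next name = max(existing name) + 1.
unsigned nextNameFromMax(const std::vector<unsigned>& names) {
  if (names.empty()) {
    return 0;
  }
  return *std::max_element(names.begin(), names.end()) + 1;
}

// True if `candidate` is already used by one of `names`.
bool collides(const std::vector<unsigned>& names, unsigned candidate) {
  return std::find(names.begin(), names.end(), candidate) != names.end();
}
```

For source names {0, 2, 3}, the count-based counter yields 3, which is already taken, while the max-based counter yields 4. Copying the source's counter directly, as the PR does, gives a collision-free result without recomputing the maximum.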

greptile-apps · 2026-03-05T00:20:36Z

Additional Comments (3)

csrc/fusion.cpp, line 208
Fusion::copy is not self-copy safe. Calling it with from == to (the same pointer) would silently clear the source before cloning begins, producing an empty Fusion.

Although the copy constructor and copy assignment operator guard against this scenario, Fusion::copy is a public static method with no precondition enforcement at the call site. A defensive assertion would prevent a subtle footgun:

IrCloner Fusion::copy(const Fusion* from, Fusion* to) {
  NVF_ERROR(from != to, "Fusion::copy: from and to must be different Fusions");
  to->clear();

csrc/fusion.cpp, line 321
Copy constructor relies on an implicit ordering constraint: ir_container_->addFusion(this) must be called before Fusion::copy(&other, this) so that this is registered in the container's sharing_fusions_ set before clear() is invoked.

This works for the current implementation (since this is freshly constructed with no statements), but future refactoring or subclassing could break this if the order changes or this is non-empty at the time of the call. Consider adding an explicit comment documenting this invariant:

Fusion::Fusion(const Fusion& other) : ir_container_(other.ir_container_) {
  FUSER_PERF_SCOPE("Fusion copy");
  ir_container_->addFusion(this);
  // NOTE: Fusion::copy begins with to->clear() which calls
  // ir_container_->removeStatementsOwnedBy(this). Since this was just
  // registered with addFusion above and has no statements yet, the
  // clear is a no-op. This ordering must be preserved.
  Fusion::copy(&other, this);
}

csrc/fusion.cpp, line 227
After Fusion::copy, the source and destination hold identical (but independent) name counter values, since the counters are copied rather than shared. This is intentional: it ensures GreedyParams::at(tv->name()) and normalization_utils can perform cross-Fusion name lookups with matching names (e.g., source's T0 matches dest's T0).

However, a side-effect is that names are unique within a single Fusion but not within the shared IrContainer. If both Fusions independently create new vals post-copy, they each receive the same next name, producing duplicate (ValType, name) pairs in the container. Consider adding an inline comment to document this invariant:

  // After this point, both source and dest Fusions share the same name counter
  // state. Names are unique within a single Fusion's val set but NOT unique
  // across all vals in the shared IrContainer. Cross-Fusion name lookups
  // (e.g. GreedyParams, normalization_utils) rely on this matching.
  to->val_type_name_map_ = from->val_type_name_map_;
  to->expr_name_counter_ = from->expr_name_counter_;
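
The duplicate-name side effect can be demonstrated with a toy model. This is a sketch under assumptions: `Container`, `ToyFusion`, and `newVal` are hypothetical stand-ins, and `val_type_name_map` here only mirrors the role of `val_type_name_map_`.

```cpp
#include <cassert>
#include <map>
#include <utility>
#include <vector>

enum class ValType { TensorView, Scalar };

// Shared storage: records every (type, name) pair from all Fusions.
struct Container {
  std::vector<std::pair<ValType, unsigned>> names;
};

// Hypothetical Fusion with its own per-type name counter.
struct ToyFusion {
  Container* container;
  std::map<ValType, unsigned> val_type_name_map;

  unsigned newVal(ValType type) {
    assert(container != nullptr);
    unsigned name = val_type_name_map[type]++;  // per-Fusion counter
    container->names.emplace_back(type, name);
    return name;
  }
};
```

If A creates two TensorViews and B is then built with a copy of A's counter map, the next TensorView in each Fusion is named 2. The shared container then holds (TensorView, 2) twice, so name-based lookups are only meaningful when scoped to one Fusion.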

greptile-apps · 2026-03-05T00:58:23Z

Additional Comments (1)

csrc/fusion.cpp, line 342
The explicit clear() on line 339 weakens exception safety in the copy-and-swap idiom. If swap throws (e.g., std::bad_alloc in the vector allocations at lines 125–130), *this has already been cleared and its original content is permanently lost.

The standard idiom should be:

Fusion& Fusion::operator=(const Fusion& other) {
  if (this != &other) {
    Fusion copy(other);
    swap(*this, copy);  // On throw: *this still has original state intact
    // copy destructs → calls clear() + removeFusion → cleans up *this's old data
  }
  return *this;
}

Without the explicit clear(), if swap throws before modifying *this's state, the original content remains intact. The temporary's destructor then handles cleanup of *this's old data that was swapped into it.
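
The guarantee can be checked with a toy class whose swap can be forced to fail before it mutates anything. This is a sketch under assumptions: `Toy` and the `g_fail_next_swap` test hook are hypothetical, and the real Fusion::swap may fail partway through, in which case only the basic guarantee would hold.

```cpp
#include <cassert>
#include <new>
#include <utility>
#include <vector>

bool g_fail_next_swap = false;  // hypothetical test hook

struct Toy {
  std::vector<int> vals;

  Toy() = default;
  explicit Toy(std::vector<int> v) : vals(std::move(v)) {}
  Toy(const Toy&) = default;

  // Simulates a swap that throws before touching either operand.
  friend void swap(Toy& a, Toy& b) {
    if (g_fail_next_swap) {
      throw std::bad_alloc();
    }
    a.vals.swap(b.vals);
  }

  // Copy-and-swap without an explicit clear(): if the copy or the swap
  // throws before mutation, *this keeps its original contents.
  Toy& operator=(const Toy& other) {
    if (this != &other) {
      Toy copy(other);
      swap(*this, copy);
      // `copy` now holds the old state and releases it on scope exit.
    }
    return *this;
  }
};
```

With `g_fail_next_swap` set, `a = b` throws and `a` still holds its original values; with an explicit `clear()` before the swap, `a` would be empty at that point instead.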

mdavis36 · 2026-03-05T01:02:06Z

!test

mdavis36 changed the title ~~[IR Container] Phase 2 Copy-Move Semantics~~ [IR Container] Phase 2.5 Copy-Move Semantics Feb 18, 2026

mdavis36 force-pushed the md/phase2-copy-move branch from 192fd55 to 35b7405 Compare February 18, 2026 03:13

mdavis36 force-pushed the md/phase2-per-fusion branch from 33629cb to 8b162d9 Compare February 18, 2026 03:13

mdavis36 mentioned this pull request Feb 18, 2026

[IR Container] Phase 2 IR Container Refactor #5975

Draft

mdavis36 marked this pull request as ready for review February 18, 2026 06:37

greptile-apps bot reviewed Feb 18, 2026

mdavis36 force-pushed the md/phase2-copy-move branch from 35b7405 to 88b2e60 Compare February 26, 2026 00:29

mdavis36 force-pushed the md/phase2-per-fusion branch from 8b162d9 to b8d202d Compare February 26, 2026 00:29

mdavis36 mentioned this pull request Feb 26, 2026

[IR Container] Phase 2.4 Per-fusion statement tracking #5961

Open

mdavis36 force-pushed the md/phase2-copy-move branch from 88b2e60 to 9737ff1 Compare March 3, 2026 01:08

mdavis36 force-pushed the md/phase2-per-fusion branch from bc595c5 to 9f944b4 Compare March 3, 2026 02:51

mdavis36 force-pushed the md/phase2-copy-move branch from a9c62ea to 46080be Compare March 3, 2026 02:51

mdavis36 force-pushed the md/phase2-per-fusion branch from 9f944b4 to 7b3ce1f Compare March 4, 2026 22:04

mdavis36 force-pushed the md/phase2-copy-move branch from e5256e7 to 058a980 Compare March 4, 2026 23:09

mdavis36 force-pushed the md/phase2-per-fusion branch from 7b3ce1f to 2e491e9 Compare March 4, 2026 23:09

mdavis36 force-pushed the md/phase2-per-fusion branch from 2e491e9 to 50cb886 Compare March 5, 2026 00:07

mdavis36 added 9 commits March 4, 2026 16:07

fix: move expr_name_counter_ sync to after expr cloning in Fusion::copy

df4ab74

fix: enforce non-null ir_container_ invariant in Fusion::swap

fd5693e

fix: remove noexcept from Fusion::swap

fdddb5b

style: simplify getValName to use operator[] directly

15558ba

fix: use deterministic_vals in expr wiring loop and remove noexcept f…

6d9bb48

…rom move ops

Move assignment op control flow; comment.

e4cf268

Guard CpyCtor when other == this

ba3f1a7

mdavis36 force-pushed the md/phase2-copy-move branch from d145d9e to 3b9acdb Compare March 5, 2026 00:07

lint

c8ffe1d

mdavis36 force-pushed the md/phase2-copy-move branch from 3b9acdb to c8ffe1d Compare March 5, 2026 00:47

Greptile Summary

Confidence Score: 3/5
