[mypyc] Subclass __mypyc_defaults_setup drops inherited class-attribute initializers across incremental builds #21542

@georgesittas

Bug Report

When mypyc rebuilds a subclass and the subclass's base class is loaded from mypy's incremental cache (rather than freshly parsed), the subclass's emitted __mypyc_defaults_setup drops every inherited class-attribute initializer. The subclass's slots for those attributes stay at the "undefined" sentinel and any access through compiled code raises AttributeError: attribute '<name>' of '<base>' undefined.

The condition shows up whenever a mypycify(separate=True) project is built twice in a row after a content change to a base class. This is the pattern pip wheel follows when it runs setup.py once for metadata and then again for the actual build.

To Reproduce

pkg/__init__.py (empty), pkg/base.py, pkg/sub.py, setup.py:

# pkg/base.py
class Base:
    ATTR_00: bool = False
    ATTR_01: bool = False

    def use_attr(self) -> int:
        if self.ATTR_00:
            return 1
        return 0

# pkg/sub.py
from pkg.base import Base


class Sub(Base):
    # An override forces Sub to emit its own __mypyc_defaults_setup.
    # Without it, Sub falls back to Base's via the type's MRO and the
    # dropped inherited default is silently masked.
    ATTR_01: bool = True

# setup.py
from setuptools import setup
from mypyc.build import mypycify

setup(
    name="pkg",
    packages=["pkg"],
    ext_modules=mypycify(["pkg/base.py", "pkg/sub.py"], separate=True),
)

# 1. clean build
rm -rf build .mypy_cache pkg/*.so
python setup.py build_ext --inplace

# 2. any content change to base.py, plus touch all sources
printf "\n" >> pkg/base.py
find pkg -name '*.py' -exec touch {} +

# 3. two consecutive build_ext invocations
python setup.py build_ext --inplace
python setup.py build_ext --inplace

# 4. crash
python -c "from pkg.sub import Sub; Sub().use_attr()"

Both setup.py invocations in step 3 are needed; a single build_ext against the same edit produces a correct binary. The touch is also needed, because without it distutils doesn't see anything out of date and the second invocation is a no-op.
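
For anyone who wants to run the steps above from CI rather than by hand, here is a rough Python driver for the same sequence. It is only a sketch: the function names (clean, edit_and_touch, reproduce) are made up for illustration, and it assumes it is pointed at the directory that contains setup.py and pkg/.

```python
import shutil
import subprocess
import sys
from pathlib import Path

# Same commands as the shell steps, run with the current interpreter.
BUILD_CMD = [sys.executable, "setup.py", "build_ext", "--inplace"]
CHECK_CMD = [sys.executable, "-c", "from pkg.sub import Sub; Sub().use_attr()"]


def clean(root: Path) -> None:
    """Step 1 cleanup: drop build output, the mypy cache and compiled modules."""
    for name in ("build", ".mypy_cache"):
        shutil.rmtree(root / name, ignore_errors=True)
    for so in (root / "pkg").glob("*.so"):
        so.unlink()


def edit_and_touch(root: Path) -> None:
    """Step 2: append a newline to base.py, then bump every source mtime."""
    with open(root / "pkg" / "base.py", "a") as f:
        f.write("\n")
    for src in (root / "pkg").rglob("*.py"):
        src.touch()


def reproduce(root: Path) -> int:
    """Run steps 1-4 and return the exit code of the final import check."""
    clean(root)
    subprocess.run(BUILD_CMD, cwd=root, check=True)
    edit_and_touch(root)
    for _ in range(2):  # step 3: both invocations are required
        subprocess.run(BUILD_CMD, cwd=root, check=True)
    return subprocess.run(CHECK_CMD, cwd=root).returncode
```

Calling reproduce(Path.cwd()) from the project root should return a non-zero exit code while the bug is present, since the final check dies with the AttributeError.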

Expected Behavior

Sub().use_attr() returns 0, same as the clean build.

Actual Behavior

Traceback (most recent call last):
  File "<string>", line 1, in <module>
    from pkg.sub import Sub; Sub().use_attr()
                             ~~~~~~~~~~~~~~^^
  File "pkg/base.py", line 6, in use_attr
    if self.ATTR_00:
AttributeError: attribute 'ATTR_00' of 'Base' undefined

Root cause (Claude-aided)

find_attr_initializers in mypyc/irbuild/classdef.py collects class-attribute default initializers by walking the MRO and iterating each entry's info.defn.defs.body:

for info in reversed(cdef.info.mro):
    if info not in builder.mapper.type_to_ir:
        continue
    for stmt in info.defn.defs.body:
        ...
        attrs_with_defaults.add(name)
        default_assignments.append((stmt, info.module_name))

This works when the base class is in the same compilation pass as the subclass, because its ClassDef.defs.body is the freshly parsed AST and contains the AssignmentStmt nodes.

It does not, however, work when the base class is loaded from mypy's incremental cache. ClassDef.serialize explicitly does not serialize the body, and ClassDef.deserialize always reconstructs the class with an empty Block([]):

def serialize(self) -> JsonDict:
    # Not serialized: defs, base_type_exprs, metaclass, decorators,
    # analyzed (for named tuples etc.)
    return {
        ".class": "ClassDef",
        "name": self.name,
        "fullname": self.fullname,
        "type_vars": [v.serialize() for v in self.type_vars],
    }

@classmethod
def deserialize(cls, data: JsonDict) -> ClassDef:
    res = ClassDef(
        data["name"],
        Block([]),
        ...
    )

So for any cache-loaded class, info.defn.defs.body == []. find_attr_initializers happily iterates zero statements and produces an empty default_assignments list. generate_attr_defaults_init then emits a __mypyc_defaults_setup that only sets the subclass's own overrides; every inherited attribute slot stays at the undefined sentinel.
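
To make the mechanism concrete, here is a toy model of the collection loop. None of this is mypy or mypyc code: ToyClass, cache_round_trip and collect_defaults are hypothetical stand-ins that only mimic the shape of the MRO walk and the effect of a cache round trip that drops the class body.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ToyClass:
    """Hypothetical stand-in for a TypeInfo together with its class body."""

    name: str
    # (attribute name, default value) pairs standing in for AssignmentStmt nodes
    body: list[tuple[str, object]] = field(default_factory=list)
    base: Optional["ToyClass"] = None

    def mro(self) -> list["ToyClass"]:
        # Single inheritance is enough for this sketch.
        chain = [self]
        if self.base is not None:
            chain.extend(self.base.mro())
        return chain


def cache_round_trip(cls: ToyClass) -> ToyClass:
    """Simulate serialize/deserialize: the name survives, the body does not."""
    return ToyClass(cls.name, [], cls.base)


def collect_defaults(cls: ToyClass) -> dict[str, object]:
    """Walk the MRO base-first and read defaults from each class body."""
    defaults: dict[str, object] = {}
    for info in reversed(cls.mro()):
        for name, value in info.body:
            defaults[name] = value
    return defaults


base = ToyClass("Base", [("ATTR_00", False), ("ATTR_01", False)])

# Base freshly parsed: Sub sees the inherited default plus its own override.
fresh = collect_defaults(ToyClass("Sub", [("ATTR_01", True)], base))

# Base loaded from cache: only Sub's own override is collected.
stale = collect_defaults(ToyClass("Sub", [("ATTR_01", True)], cache_round_trip(base)))
```

In this model, fresh contains both ATTR_00 and ATTR_01, while stale contains only ATTR_01, which matches the emitted __mypyc_defaults_setup described above.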

The 2-invocation pattern is what flips Base from "freshly parsed" to "cache-loaded":

  1. First build_ext: the source change to base.py makes mypy recheck base, run mypyc against the freshly parsed AST, and write the incremental cache. sub is fresh (its content didn't change), so it's loaded from cache and its existing .so is left alone.
  2. Second build_ext: mypy now considers base stable and loads it from the cache instead of re-parsing it, so base's ClassDef.defs.body is empty. Meanwhile distutils sees sub.py's newer mtime and triggers a rebuild of sub through mypyc. find_attr_initializers(Sub) walks Sub's MRO, hits cache-loaded Base with an empty body, and produces the broken __mypyc_defaults_setup.

pip wheel <src> --no-build-isolation falls into this pattern naturally: it imports setup.py once for metadata (which runs mypycify because it sits inside ext_modules=) and once for the build itself. That's how we hit this in practice: pip wheel against a source tree whose base classes had recent edits produces wheels that crash at runtime on any read of an inherited class attribute.

The reason Sub needs its own override in the reproducer is that, without one, Sub has no class-body assignments at all, so mypyc emits no __mypyc_defaults_setup for it and the instance falls back to Base's via the type's MRO. The underlying defect is still present in that case (no inherited initializers are collected for Sub), but the MRO fallback hides it at runtime.

Possible fixes

  1. mypyc: chain the call. Make every subclass __mypyc_defaults_setup start by calling the base class's __mypyc_defaults_setup on self (the equivalent of super().__mypyc_defaults_setup()) and then emit only its own overrides. The subclass no longer needs to know its base's defaults at IR-build time. Inherited defaults are set at runtime by walking up the type chain, regardless of whether the base's AST is in memory. This is the cleanest fix and mirrors how __init__ chaining already works in Python.
  2. mypyc: read from ClassIR when the AST is missing. When info.defn.defs.body is empty but info is in mapper.type_to_ir, recover inherited defaults from the cached ClassIR. This requires the default values to live somewhere on the ClassIR (today only the names are stored in attrs_with_defaults), so it implies extending the IR serialization.
  3. mypy: serialize the body assignments. Persist the AssignmentStmt nodes for class-body attribute declarations as part of ClassDef.serialize. This is the smallest invariant change for callers, but it increases cache size and only addresses this particular consumer of defn.defs.body.
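
As a rough illustration of option 1, here is a pure-Python analogy. It is not what mypyc emits: _defaults_setup is a hypothetical stand-in for __mypyc_defaults_setup, and SubFlattened, SubChained and reads_ok are names invented for this sketch. SubFlattened models the broken output from this report, where the inherited initializers were dropped; SubChained models the chained variant.

```python
class Base:
    def __init__(self) -> None:
        self._defaults_setup()

    def _defaults_setup(self) -> None:
        # Stand-in for Base's generated defaults setup.
        self.ATTR_00 = False
        self.ATTR_01 = False

    def use_attr(self) -> int:
        return 1 if self.ATTR_00 else 0


class SubFlattened(Base):
    def _defaults_setup(self) -> None:
        # Models the broken output: inherited defaults were meant to be
        # copied in here at build time, but only the override survived.
        self.ATTR_01 = True


class SubChained(Base):
    def _defaults_setup(self) -> None:
        # Option 1: let the base set its own defaults at runtime,
        # then apply only this class's overrides.
        super()._defaults_setup()
        self.ATTR_01 = True


def reads_ok(cls: type) -> bool:
    """Return True if an instance can read the inherited attribute."""
    try:
        cls().use_attr()
    except AttributeError:
        return False
    return True
```

Here reads_ok(SubFlattened) is False and reads_ok(SubChained) is True. The chained version does not depend on what the subclass knew about Base when it was compiled, which is the property that makes it robust to a cache-loaded base.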

Your Environment

  • Mypy version used: 2.2.0+dev.965dd31224bc2a0694e7343f927ff9c164b4b673 (master)
  • Mypy command-line flags: none directly; invoked via mypycify(["pkg/base.py", "pkg/sub.py"], separate=True) from setup.py
  • Mypy configuration options: none
  • Python version used: 3.13.12
  • OS: macOS 26.0 (arm64); also reproduces on Linux x86_64 in CI

Labels: bug