[Bug] GRASP model broken in 2.0.0: Incompatible with new BaseModel API (unexpected keyword argument 'feature_keys') #891

@luancaarvalho

Description

When attempting to use the GRASP model from the pip package pyhealth==2.0.0, the model fails to instantiate.

It appears that the published package contains an inconsistent combination of:

  • a newer BaseModel API
  • an older GRASP implementation that still depends on legacy BaseModel methods and constructor arguments

I was able to reproduce this both:

  1. in our main project environment
  2. in a fresh clean Miniconda environment created only to validate GRASP

To Reproduce

  1. Create a clean environment
  2. Install PyHealth from PyPI:
    pip install pyhealth==2.0.0
  3. Inspect the model / base class APIs or try to instantiate GRASP

Minimal example:

import inspect
import pyhealth
from pyhealth.models import GRASP, base_model

print("pyhealth.__version__ =", pyhealth.__version__)
print("GRASP.__init__ =", inspect.signature(GRASP.__init__))
print("BaseModel.__init__ =", inspect.signature(base_model.BaseModel.__init__))
print("has get_label_tokenizer =", hasattr(base_model.BaseModel, "get_label_tokenizer"))
print("has prepare_labels =", hasattr(base_model.BaseModel, "prepare_labels"))
print("has add_feature_transform_layer =", hasattr(base_model.BaseModel, "add_feature_transform_layer"))

Then try:

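from pyhealth.models import GRASP

# `dataset` is an already-built PyHealth sample dataset with a
# "list_codes" feature and a "label" field (construction not shown here).
model = GRASP(
    dataset=dataset,
    feature_keys=["list_codes"],
    label_key="label",
    use_embedding=[True],
    mode="binary",
)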

Actual Behavior

The model crashes during initialization.

First error:

TypeError: BaseModel.__init__() got an unexpected keyword argument 'feature_keys'

If BaseModel.__init__ is monkey-patched to ignore extra kwargs, the next error appears:

AttributeError: 'GRASP' object has no attribute 'get_label_tokenizer'

The same legacy mismatch also affects other calls inside GRASP, for example:

  • self.add_feature_transform_layer(...)
  • self.get_output_size(self.label_tokenizer)
  • self.prepare_labels(...)
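
For readers without PyHealth installed, the two-stage failure can be illustrated with stand-in classes. This is only a sketch of the mechanism: `BaseModel` and `LegacyModel` below are hypothetical stand-ins written for this example, not the PyHealth classes, and the exact error text may vary slightly between Python versions.

```python
class BaseModel:
    """Stand-in for a newer base class whose constructor accepts only `dataset`."""

    def __init__(self, dataset=None):
        self.dataset = dataset


class LegacyModel(BaseModel):
    """Stand-in for a model still written against the legacy base-class API."""

    def __init__(self, dataset, feature_keys, label_key, mode):
        # Legacy-style call: passes arguments the new base class does not accept.
        super().__init__(
            dataset=dataset,
            feature_keys=feature_keys,
            label_key=label_key,
            mode=mode,
        )
        # Legacy helper that no longer exists on the base class.
        self.label_tokenizer = self.get_label_tokenizer()


def try_build():
    """Attempt construction and return (exception type name, message), or None."""
    try:
        LegacyModel(
            dataset=None,
            feature_keys=["list_codes"],
            label_key="label",
            mode="binary",
        )
    except (TypeError, AttributeError) as exc:
        return type(exc).__name__, str(exc)
    return None


# Stage 1: the unexpected keyword argument is rejected by the base constructor.
first_error = try_build()

# Stage 2: patch the base constructor to swallow extra keyword arguments,
# which exposes the next mismatch (the missing helper method).
_original_init = BaseModel.__init__


def _tolerant_init(self, dataset=None, **_ignored):
    _original_init(self, dataset=dataset)


BaseModel.__init__ = _tolerant_init
second_error = try_build()

print(first_error)
print(second_error)
```

The point of the sketch is that patching the constructor only moves the failure: each legacy helper call is a separate incompatibility.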

Expected Behavior

GRASP should initialize successfully and be compatible with the current PyHealth 2.0 model/data-processing stack.

Observed Root Cause

In the installed package, GRASP still behaves like a legacy 1.x-style model.

Inside pyhealth/models/grasp.py, GRASP.__init__ still does:

super(GRASP, self).__init__(
    dataset=dataset,
    feature_keys=feature_keys,
    label_key=label_key,
    mode=mode,
)

and later calls:

self.label_tokenizer = self.get_label_tokenizer()
self.add_feature_transform_layer(...)
output_size = self.get_output_size(self.label_tokenizer)
y_true = self.prepare_labels(kwargs[self.label_key], self.label_tokenizer)

However, the installed BaseModel only exposes the new API. In our clean environment:

inspect.signature(base_model.BaseModel.__init__)
# (self, dataset: pyhealth.datasets.sample_dataset.SampleDataset = None)

and the following methods are missing from BaseModel:

  • get_label_tokenizer
  • prepare_labels
  • add_feature_transform_layer
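
The same check can be generalized into a small helper that reports which constructor keywords a base class would reject and which helper methods it lacks. This is a minimal sketch using only the standard `inspect` module; `NewStyleBase`, `unsupported_kwargs`, and `missing_methods` are names invented for this example, and `NewStyleBase` is a stand-in rather than the real PyHealth class.

```python
import inspect

LEGACY_HELPERS = (
    "get_label_tokenizer",
    "prepare_labels",
    "add_feature_transform_layer",
)


def unsupported_kwargs(cls, kwarg_names):
    """Return the keyword names that cls.__init__ would reject."""
    params = inspect.signature(cls.__init__).parameters
    # A **kwargs parameter means every keyword is accepted.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return []
    keyword_kinds = (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    )
    accepted = {
        name
        for name, p in params.items()
        if name != "self" and p.kind in keyword_kinds
    }
    return [name for name in kwarg_names if name not in accepted]


def missing_methods(cls, method_names):
    """Return the method names that are not callable attributes of cls."""
    return [n for n in method_names if not callable(getattr(cls, n, None))]


class NewStyleBase:
    """Stand-in mirroring the constructor shape reported above: only `dataset`."""

    def __init__(self, dataset=None):
        self.dataset = dataset


rejected = unsupported_kwargs(
    NewStyleBase, ["dataset", "feature_keys", "label_key", "mode"]
)
absent = missing_methods(NewStyleBase, LEGACY_HELPERS)
print("rejected kwargs:", rejected)
print("missing helpers:", absent)
```

In an environment with PyHealth installed, passing `base_model.BaseModel` instead of the stand-in should give the corresponding report for the real class.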

Additional Important Observation

In the clean test environment:

  • pip show pyhealth reports:
    • Version: 2.0.0
  • but after import:
    • pyhealth.__version__ == "1.1.4"

So there may also be a packaging/version inconsistency in the published wheel/sdist, not only a GRASP implementation bug.
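
The version discrepancy can be checked without calling pip, by comparing the installed distribution metadata with the module attribute. This is a minimal sketch using the standard library's `importlib.metadata`; the helper name `version_report` is made up for this example.

```python
from importlib import metadata


def version_report(dist_name, module_version):
    """Compare the version in the installed distribution metadata with the
    version string reported by the imported module."""
    try:
        installed = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        installed = None
    return {
        "distribution": installed,
        "module": module_version,
        "consistent": installed is not None and installed == module_version,
    }


if __name__ == "__main__":
    try:
        import pyhealth
    except ImportError:
        print("pyhealth is not installed in this environment")
    else:
        print(version_report("pyhealth", getattr(pyhealth, "__version__", None)))
```

If the two values differ, as reported above, `consistent` is `False`, which points at the packaged files rather than at the local environment.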

Environment

  • OS: Ubuntu 24.04.1 LTS
  • Kernel: Linux 6.17.0-14-generic
  • Python: 3.12
  • Installation method: pip install pyhealth==2.0.0
  • Hardware used during validation: NVIDIA RTX PRO 6000 Blackwell Max-Q Workstation Edition
  • Reproduced in:
    • our main project environment
    • a fresh temporary Miniconda environment created only for this test

Why this matters

We are using PyHealth in a real longitudinal clinical prediction pipeline and validated several other models successfully, but GRASP is currently unusable because of this API mismatch. This blocks model benchmarking under the same 2.0 pipeline as the other architectures.

Suggested Fix

At least one of the following should happen upstream:

  1. Refactor pyhealth/models/grasp.py to fully match the 2.0 API

    • remove legacy arguments from super().__init__
    • remove use of legacy BaseModel helper methods
    • consume tensors/processors from the 2.0 data pipeline directly
  2. Or explicitly mark/disable GRASP as unsupported in the current 2.0 release until it is migrated

  3. Also verify the published PyPI package contents

    • pip show pyhealth reports Version: 2.0.0 while the imported pyhealth.__version__ is "1.1.4", which strongly suggests a packaging inconsistency
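
As an illustration of option 2, a model that has not been migrated could fail early with an explicit message instead of an incidental TypeError. This is only a sketch of the idea: `UnsupportedModelError`, `mark_unsupported`, and `GraspPlaceholder` are hypothetical names for this example and are not part of PyHealth.

```python
class UnsupportedModelError(NotImplementedError):
    """Raised when a model is known to be incompatible with the current release."""


def mark_unsupported(reason):
    """Class decorator that replaces __init__ with one that fails with a clear message."""

    def decorator(cls):
        def __init__(self, *args, **kwargs):
            raise UnsupportedModelError(
                f"{cls.__name__} is not supported in this release: {reason}"
            )

        cls.__init__ = __init__
        return cls

    return decorator


@mark_unsupported("not yet migrated to the 2.0 BaseModel API")
class GraspPlaceholder:
    """Hypothetical placeholder standing in for a not-yet-migrated model."""

    def __init__(self, dataset=None):
        self.dataset = dataset


try:
    GraspPlaceholder(dataset=None)
except UnsupportedModelError as exc:
    guard_message = str(exc)
    print(guard_message)
```

Subclassing `NotImplementedError` keeps the failure catchable by existing handlers while making the cause explicit to users.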

If useful, I can also provide:

  • the exact stack traces
  • the minimal inspection script
  • the clean-environment reproduction commands
  • a proposed compatibility patch we tested locally
