Add BHCToAVS model for patient-friendly summaries #730

charanw · 2025-12-10T02:16:05Z

Contributor: Charan Williams (charanw2@illinois.edu)

Contribution Type: New Model

Description:
Added a new clinical summarization model, BHCToAVS, which converts Brief Hospital Course (BHC) notes into patient-friendly After-Visit Summaries (AVS). The model wraps a fine-tuned Mistral-7B LoRA adapter hosted on Hugging Face and integrates with the PyHealth model API. This contribution includes the full model implementation, unit tests, documentation, and an example usage script.

Files to Review:

pyhealth/models/bhc_to_avs.py — Main model implementation
pyhealth/models/__init__.py — Added import for the new model
tests/core/test_bhc_to_avs.py — Unit test for the BHCToAVS model
docs/api/models/pyhealth.models.bhc_to_avs.rst — Sphinx documentation file
docs/api/models.rst — Updated model index to include BHCToAVS
examples/bhc_to_avs_example.py — Example usage demonstrating model prediction

Introduces the BHCToAVS model, which converts clinical Brief Hospital Course (BHC) notes into After-Visit Summaries (AVS) using a fine-tuned Mistral 7B model with LoRA adapters. Adds model implementation, documentation, an example usage script, and unit tests.

Copilot

Pull request overview

This PR adds a new clinical summarization model, BHCToAVS, that converts Brief Hospital Course (BHC) notes into patient-friendly After-Visit Summaries (AVS) using a fine-tuned Mistral-7B LoRA adapter. The implementation integrates with PyHealth's model API and includes comprehensive documentation and examples.

Implements a new text generation model wrapping a Hugging Face LoRA adapter
Adds unit tests with graceful handling of model download failures
Provides example usage demonstrating the model's clinical text summarization capabilities

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 9 comments.

Show a summary per file

File	Description
pyhealth/models/bhc_to_avs.py	Core model implementation with predict() method for generating patient-friendly summaries
pyhealth/models/__init__.py	Added BHCToAVS to module exports
tests/core/test_bhc_to_avs.py	Unit test validating the predict method with error handling for gated models
docs/api/models/pyhealth.models.BHCToAVS.rst	Sphinx autodoc configuration for the new model
docs/api/models.rst	Updated model index to include BHCToAVS
examples/bhc_to_avs_example.py	Example script demonstrating model usage with synthetic clinical text


tests/core/test_bhc_to_avs.py

Copilot · 2025-12-27T10:29:32Z

pyhealth/models/bhc_to_avs.py

+                "text-generation",
+                model=model,
+                tokenizer=tokenizer,
+                device_map="auto",


device_map="auto" is passed twice: once in the AutoModelForCausalLM.from_pretrained call (line 48) and again in the pipeline constructor (line 65). The second occurrence is redundant because the model has already been placed on devices, and it may cause conflicts or unexpected behavior.

Suggested change

-                device_map="auto",
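
A minimal sketch of the single-placement pattern this comment points toward, assuming `transformers` is installed along with `accelerate` (which `device_map="auto"` relies on). The helper name `build_pipeline` is hypothetical and not part of the PR, and LoRA adapter loading is left out for brevity.

```python
def build_pipeline(base_model_id: str):
    """Load a causal LM once with device placement, then wrap it in a pipeline."""
    # Imported lazily so that defining this helper does not require the
    # heavy dependencies to be installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

    tokenizer = AutoTokenizer.from_pretrained(base_model_id)
    # Device placement happens here, exactly once.
    model = AutoModelForCausalLM.from_pretrained(base_model_id, device_map="auto")
    # The model is already dispatched, so device_map is not repeated here.
    return pipeline("text-generation", model=model, tokenizer=tokenizer)
```

Calling the helper downloads the model weights, so it is best exercised outside unit tests.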

Copilot · 2025-12-27T10:29:33Z

pyhealth/models/bhc_to_avs.py

+# Author: Charan Williams
+# NetID: charanw2
+# Description: Converts clinical brief hospital course (BHC) data to after visit summaries using a fine-tuned Mistral 7B model.


The module header uses a "# Description:" comment, which is not a Python docstring. The description should either be a proper module-level docstring (triple-quoted string) or follow a consistent comment format without the "Description:" label.

Suggested change

-# Author: Charan Williams
-# NetID: charanw2
-# Description: Converts clinical brief hospital course (BHC) data to after visit summaries using a fine-tuned Mistral 7B model.
+"""Convert clinical brief hospital course (BHC) data to after-visit
+summaries using a fine-tuned Mistral 7B model."""
+# Author: Charan Williams
+# NetID: charanw2
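
The practical difference between the two header styles can be checked with the standard library: a leading comment is invisible to tooling, while a leading string literal becomes the module docstring. The sketch below uses shortened, illustrative header text rather than the PR's exact lines.

```python
import ast

# Header written only as comments (shortened, illustrative text).
COMMENT_HEADER = '''# Author: Charan Williams
# Description: Converts BHC data to after-visit summaries.
import os
'''

# Header written as a module-level docstring followed by comments.
DOCSTRING_HEADER = '''"""Convert BHC data to after-visit summaries."""
# Author: Charan Williams
import os
'''

# ast.get_docstring returns None when the first statement is not a string literal.
comment_doc = ast.get_docstring(ast.parse(COMMENT_HEADER))
docstring_doc = ast.get_docstring(ast.parse(DOCSTRING_HEADER))

print(comment_doc)
print(docstring_doc)
```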

Copilot · 2025-12-27T10:29:33Z

pyhealth/models/bhc_to_avs.py

+@dataclass
+class BHCToAVS(BaseModel):
+    base_model_id: str = field(default="mistralai/Mistral-7B-Instruct")
+    """HuggingFace repo containing the base Mistral 7B model."""
+
+    adapter_model_id: str = field(default="williach31/mistral-7b-bhc-to-avs-lora")
+    """HuggingFace repo containing only LoRA adapter weights."""
+


Missing documentation for the BHCToAVS class itself. The class lacks a docstring explaining its purpose, parameters, and usage. Only the individual fields and methods have documentation.

Copilot · 2025-12-27T10:29:33Z

pyhealth/models/bhc_to_avs.py

+        str
+            Patient-friendly summary.
+        """
+


Missing input validation for the bhc_text parameter. The method should validate that bhc_text is not None and is a non-empty string before processing to provide clearer error messages to users.

Suggested change

+        # Validate input to provide clear error messages and avoid unexpected failures.
+        if bhc_text is None:
+            raise ValueError("bhc_text must not be None.")
+        if not isinstance(bhc_text, str):
+            raise TypeError(f"bhc_text must be a string, got {type(bhc_text).__name__}.")
+        if not bhc_text.strip():
+            raise ValueError("bhc_text must be a non-empty string.")
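
To see which exception each kind of bad input would trigger, the same checks can be pulled into a standalone helper. This is only a sketch: `validate_bhc_text` and `error_name` are hypothetical names that do not appear in the PR.

```python
def validate_bhc_text(bhc_text):
    """Apply the suggested checks and return the text unchanged if it passes."""
    if bhc_text is None:
        raise ValueError("bhc_text must not be None.")
    if not isinstance(bhc_text, str):
        raise TypeError(f"bhc_text must be a string, got {type(bhc_text).__name__}.")
    if not bhc_text.strip():
        raise ValueError("bhc_text must be a non-empty string.")
    return bhc_text


def error_name(value):
    """Return the name of the exception raised for `value`, or None if it is valid."""
    try:
        validate_bhc_text(value)
    except (TypeError, ValueError) as exc:
        return type(exc).__name__
    return None


for candidate in (None, 123, "   ", "Patient admitted with pneumonia."):
    print(repr(candidate), "->", error_name(candidate))
```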

Copilot · 2025-12-27T10:29:33Z

pyhealth/models/bhc_to_avs.py

+@dataclass
+class BHCToAVS(BaseModel):


The dataclass decorator on a class inheriting from BaseModel (which inherits from nn.Module) may not properly initialize the parent class, because the dataclass-generated __init__ does not call the parent's __init__. The class should define a __post_init__ method that calls super().__init__() to ensure nn.Module is properly initialized. Without this, features like the _dummy_param used for device detection may not work correctly.
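
The initialization gap can be reproduced without PyTorch. In the sketch below, `FakeModule` is a hypothetical stand-in for nn.Module (its `__init__` creates state that later methods depend on), and the two dataclasses differ only in whether `__post_init__` calls the parent initializer. The field default is illustrative.

```python
from dataclasses import dataclass, field


class FakeModule:
    """Stand-in for nn.Module: __init__ creates state that methods rely on."""

    def __init__(self):
        self._parameters = {}

    def parameter_count(self):
        return len(self._parameters)


@dataclass
class WithoutPostInit(FakeModule):
    # The generated __init__ only assigns fields; FakeModule.__init__ never runs.
    adapter_model_id: str = field(default="example/adapter")


@dataclass
class WithPostInit(FakeModule):
    adapter_model_id: str = field(default="example/adapter")

    def __post_init__(self):
        # Runs after the generated __init__, so the parent state gets created.
        super().__init__()


def base_initialized(obj):
    """Return True if the parent-class state exists on `obj`."""
    try:
        obj.parameter_count()
    except AttributeError:
        return False
    return True


print(base_initialized(WithoutPostInit()))
print(base_initialized(WithPostInit()))
```

With the real nn.Module the missing state differs, but the mechanism is the same: a dataclass-generated `__init__` does not chain to the parent on its own.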

Copilot · 2025-12-27T10:29:34Z

pyhealth/models/bhc_to_avs.py

+_PROMPT = """Summarize for the patient what happened during the hospital stay:
+
+### Brief Hospital Course:
+{bhc}
+
+### Patient Summary:
+"""
+
+# System prompt used during inference
+_SYSTEM_PROMPT = (
+    "You are a clinical summarization model. Produce accurate, patient-friendly summaries "
+    "using only information from the doctor's note. Do not add new details.\n\n"
+)
+
+# Prompt used during fine-tuning
+_PROMPT = (
+    "Summarize for the patient what happened during the hospital stay based on this doctor's note:\n"
+    "{bhc}\n\n"
+    "Summary for the patient:\n"
+)


The _PROMPT variable is defined twice (lines 12-18 and lines 27-31), with the second definition overwriting the first. This creates dead code and potential confusion. Only one prompt definition should be kept, or they should be renamed to reflect their different purposes (e.g., _TRAINING_PROMPT and _INFERENCE_PROMPT).
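
One way to resolve this is to keep only the definition that is actually in effect at runtime (the second one) and build the final prompt in a single place. The sketch below assumes the system prompt is simply prepended to the formatted template; `build_prompt` is a hypothetical helper, and the PR may assemble the prompt differently.

```python
# System prompt used during inference (copied from the diff above).
_SYSTEM_PROMPT = (
    "You are a clinical summarization model. Produce accurate, patient-friendly summaries "
    "using only information from the doctor's note. Do not add new details.\n\n"
)

# Single remaining template: the one labeled as used during fine-tuning.
_PROMPT = (
    "Summarize for the patient what happened during the hospital stay based on this doctor's note:\n"
    "{bhc}\n\n"
    "Summary for the patient:\n"
)


def build_prompt(bhc_text: str) -> str:
    """Assumed composition: system prompt followed by the filled-in template."""
    return _SYSTEM_PROMPT + _PROMPT.format(bhc=bhc_text)


print(build_prompt("Admitted for chest pain."))
```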

Copilot · 2025-12-27T10:29:34Z

pyhealth/models/bhc_to_avs.py

+            max_new_tokens=512,
+            temperature=0.0,
+            eos_token_id=[pipe.tokenizer.eos_token_id],
+            pad_token_id=pipe.tokenizer.eos_token_id,


The pipeline is missing the return_full_text=False parameter in the generate call. By default, Hugging Face text-generation pipelines return the full text including the input prompt. To return only the newly generated text, you should either set return_full_text=False in the pipeline call or manually strip the prompt from the output.

Suggested change

-            pad_token_id=pipe.tokenizer.eos_token_id,
+            pad_token_id=pipe.tokenizer.eos_token_id,
+            return_full_text=False,
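
The manual alternative mentioned in the comment, stripping the prompt from the output, could look roughly like this. The sketch relies only on the text-generation pipeline's output shape (a list of dicts with a "generated_text" key); `extract_summary` is a hypothetical helper and the sample output is fabricated for illustration.

```python
def extract_summary(outputs, prompt: str) -> str:
    """Return only the newly generated text from text-generation pipeline output."""
    generated = outputs[0]["generated_text"]
    # With return_full_text left at its default, the prompt is echoed first.
    if generated.startswith(prompt):
        generated = generated[len(prompt):]
    return generated.strip()


prompt = "Summary for the patient:\n"
# Illustrative stand-in for what a pipeline call would return.
fake_outputs = [{"generated_text": prompt + "You were treated for pneumonia."}]
print(extract_summary(fake_outputs, prompt))
```

Passing return_full_text=False is the simpler option; the manual strip mainly helps when the full text is needed elsewhere.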

Copilot · 2025-12-27T10:29:34Z

pyhealth/models/bhc_to_avs.py

+# NetID: charanw2
+# Description: Converts clinical brief hospital course (BHC) data to after visit summaries using a fine-tuned Mistral 7B model.
+
+from typing import Dict, Any


Import of 'Dict' is not used.
Import of 'Any' is not used.

Suggested change

-from typing import Dict, Any

Logiquo · 2025-12-27T10:34:01Z

The CI has failed.

Logiquo added the "component: model" label (Contribute a new model to PyHealth) on Dec 18, 2025

Logiquo requested a review from Copilot on December 27, 2025 10:24

Copilot started reviewing on behalf of Logiquo on December 27, 2025 10:25

Logiquo self-requested a review on December 27, 2025 10:25

Copilot AI reviewed on Dec 27, 2025

Logiquo added the "status: wait response" label (Pending PR author's response) on Dec 27, 2025
