fix: resolve evaluation metrics bugs in Classification_Transformers#229
Open
rixav77 wants to merge 1 commit into
Conversation
Fixes ML4SCI#192

- Add missing `micro_auroc` computation (was initialized but never populated, causing `np.mean([])` to silently return NaN)
- Add explicit `dim=-1` to the softmax call in ROC curve plotting to match the correct usage elsewhere and suppress the deprecation warning
- Replace the hardcoded W&B entity `"_archil"` with a configurable `--entity` CLI arg that falls back to the `WANDB_ENTITY` env var, allowing other contributors to use their own W&B accounts
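The softmax fix described above is the standard PyTorch pattern; a minimal illustration with toy logits (stand-ins for the project's `metrics["logits"]`, which this PR does not show):

```python
import torch
import torch.nn.functional as F

# Toy batch of logits: 2 samples, 3 classes.
logits = torch.tensor([[2.0, 1.0, 0.1],
                       [0.5, 2.5, 0.2]])

# Calling softmax without dim is deprecated and falls back to an
# implicitly chosen axis; always name the class axis explicitly.
probs = F.softmax(logits, dim=-1)

# Each row now sums to 1 over the class dimension.
assert torch.allclose(probs.sum(dim=-1), torch.ones(2))
```

For 2-D `(batch, classes)` tensors, `dim=-1` and `dim=1` are equivalent; `dim=-1` is the safer habit because it keeps working if an extra leading dimension is added.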
Pull request overview
Fixes incorrect and/or unusable evaluation logging in DeepLense_Classification_Transformers_Archil_Srivastava by addressing missing metric computation, a PyTorch softmax API misuse, and hardcoded W&B configuration.
Changes:
- Compute and log `micro_auroc` during evaluation (previously always `NaN` due to an empty list).
- Specify `dim=-1` in the ROC softmax call to avoid deprecated/ambiguous behavior.
- Replace the hardcoded W&B `entity` with a configurable `--entity` CLI argument (defaulting to `WANDB_ENTITY`).
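The configurable-entity change can be sketched as follows; the flag name and `WANDB_ENTITY` fallback come from the PR, while the surrounding parser code is illustrative rather than the project's actual source:

```python
import argparse
import os

parser = argparse.ArgumentParser()
# Fall back to the WANDB_ENTITY env var; a None default lets W&B
# resolve the logged-in user's default entity at init time.
parser.add_argument(
    "--entity",
    type=str,
    default=os.environ.get("WANDB_ENTITY"),
    help="W&B entity (team or username); defaults to $WANDB_ENTITY",
)

args = parser.parse_args([])                        # no flag given
override = parser.parse_args(["--entity", "my_team"])  # explicit override
assert override.entity == "my_team"
```

The parsed value would then be passed straight through, e.g. `wandb.init(entity=args.entity, ...)`, since `wandb.init` accepts `entity=None`.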
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| DeepLense_Classification_Transformers_Archil_Srivastava/eval.py | Adds missing micro_auroc, fixes softmax(..., dim=-1), and makes W&B entity configurable for evaluation runs. |
| DeepLense_Classification_Transformers_Archil_Srivastava/train.py | Makes W&B entity configurable for training runs (removes hardcoded entity). |
```diff
@@ -84,6 +85,7 @@ def evaluate(model, data_loader, loss_fn, device):
     # Wandb-specific params
     parser.add_argument("--runid", type=str, help="ID of train run")
```
Summary
Fixes #192
Three bugs in the evaluation pipeline of `DeepLense_Classification_Transformers_Archil_Srivastava` produce incorrect metrics:

Bug 1: `micro_auroc` always NaN

The `micro_auroc` list is initialized but never populated; `np.mean([])` silently returns `nan`. Added the missing `auroc_fn(..., average="micro")` call.

Bug 2: Missing `softmax` dimension

`torch.nn.functional.softmax(metrics["logits"])` on line 172 omits the `dim` argument, triggering a deprecation warning and potentially incorrect behavior. Line 158 already uses `dim=-1` correctly; applied the same fix for consistency.

Bug 3: Hardcoded W&B entity

`entity="_archil"` is hardcoded in both `train.py` and `eval.py`, causing authentication errors for other contributors. Replaced with a configurable `--entity` CLI argument that defaults to the `WANDB_ENTITY` environment variable (or `None` if unset, which lets W&B use the logged-in user's default entity).

Changes
- `eval.py`: add `micro_auroc` computation, add `dim=-1` to softmax, add `--entity` arg
- `train.py`: add `--entity` arg, replace hardcoded entity

Test plan
- `micro_auroc` is no longer NaN after evaluation
- Runs without the `--entity` flag (defaults to logged-in W&B user)
- `--entity my_team` overrides correctly
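The root cause in Bug 1 is easy to reproduce in isolation. A minimal sketch with toy data, using sklearn's `roc_auc_score` as a stand-in for the project's `auroc_fn` (whose exact signature this PR does not show):

```python
import warnings
import numpy as np
from sklearn.metrics import roc_auc_score

# An initialized-but-never-populated list averages to NaN, silently
# (np.mean([]) emits only a RuntimeWarning, easy to miss in logs).
micro_auroc = []
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    assert np.isnan(np.mean(micro_auroc))

# Appending a per-evaluation micro AUROC fixes it. Toy 3-class batch:
y_true = np.array([0, 1, 2, 1])
probs = np.array([[0.8, 0.1, 0.1],
                  [0.2, 0.6, 0.2],
                  [0.1, 0.2, 0.7],
                  [0.3, 0.5, 0.2]])
one_hot = np.eye(3)[y_true]  # micro averaging needs indicator targets
micro_auroc.append(roc_auc_score(one_hot, probs, average="micro"))
assert not np.isnan(np.mean(micro_auroc))
```

This also explains why the bug was silent: NaN propagates through `np.mean` without raising, so the metric logged to W&B simply showed as missing/NaN rather than erroring out.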