
Conversation

@phoeenniixx
Member

There is an UnpicklingError when loading weights in pytorch-forecasting v1 because the weights are not loaded correctly.
The issue appears to be torch-related and is causing failures in the CI/CD pipeline.

It surfaced after the latest lightning update, which exposed the weights_only parameter and lets torch handle its default itself.

Fixes #1998
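
Background: torch 2.6 changed the default of torch.load to weights_only=True, which refuses to unpickle arbitrary Python objects and raises _pickle.UnpicklingError for checkpoints that contain them. A minimal sketch of the failure mode and the direct opt-out (the checkpoint path is illustrative):

import torch

# On torch >= 2.6, weights_only defaults to True, so a checkpoint that
# pickles arbitrary Python objects fails with _pickle.UnpicklingError.
ckpt = torch.load("epoch=0-step=10.ckpt")

# Opting out restores the old behaviour. Only do this for trusted files:
# weights_only=False executes arbitrary pickled code.
ckpt = torch.load("epoch=0-step=10.ckpt", weights_only=False)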

@review-notebook-app

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.



@codecov

codecov bot commented Dec 1, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
⚠️ Please upload report for BASE (main@aae13ba). Learn more about missing BASE report.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #2000   +/-   ##
=======================================
  Coverage        ?   86.99%           
=======================================
  Files           ?      160           
  Lines           ?     9480           
  Branches        ?        0           
=======================================
  Hits            ?     8247           
  Misses          ?     1233           
  Partials        ?        0           
Flag      Coverage Δ
cpu       86.99% <100.00%> (?)
pytest    86.99% <100.00%> (?)

Flags with carried forward coverage won't be shown.


@fkiraly
Collaborator

fkiraly commented Dec 3, 2025

hm, not yet, right?

@phoeenniixx
Member Author

Yes, this PR is incomplete until we find a way to pass weights_only to the tuner.

@fkiraly
Collaborator

fkiraly commented Dec 3, 2025

Would overriding the tuner work? Feels dangerous, but if we are forced to?

original_load = torch.load
# force weights_only=False on every torch.load call during lr_find,
# even when the caller passes weights_only explicitly
torch.load = lambda *a, **kw: original_load(*a, **dict(kw, weights_only=False))
try:
    return super().lr_find(*args, **kwargs)
finally:
    torch.load = original_load  # always restore the real loader
Collaborator

Doing this feels dangerous, because it has side effects on everything!

I would not do this. Is there a way we can override only within a single call, or override a method on a class?
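
For instance, something like the following, as a minimal sketch (the tuner, model, and dataloader names are illustrative, not this PR's code). unittest.mock.patch swaps the attribute only for the duration of the with block and restores it on exit, even if lr_find raises:

import torch
from unittest import mock

_real_load = torch.load

def _load_weights_only_false(*args, **kwargs):
    # force the old behaviour even if the caller sets weights_only itself
    kwargs["weights_only"] = False
    return _real_load(*args, **kwargs)

with mock.patch("torch.load", new=_load_weights_only_false):
    tuner.lr_find(model, train_dataloaders=train_loader)  # illustrative call

Note the override is still process-global while the block runs; restoration is guaranteed, but any concurrent torch.load call is affected, which is the side-effect concern above.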

Member Author

I have wrapped trainer.fit here, then.

Member Author

Wrapping around trainer.fit didn't work: lr_find calls trainer._checkpoint_connector.restore after fit, which overrode the weights_only param passed to fit. So I have now wrapped load_checkpoint instead (it's deeper in the traceback, close to torch.load, but still in the lightning layer). I think this should solve the issue?
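
For reference, the instance-level pattern looks roughly like this; a minimal sketch, where the connector object and the method name are illustrative stand-ins for lightning internals, which vary across versions (the wrapped method is assumed to accept weights_only, as the newer lightning loaders do):

import contextlib

@contextlib.contextmanager
def force_weights_only_false(obj, method_name):
    """Wrap a single object's loader method so every call passes
    weights_only=False, leaving the global torch.load untouched."""
    original = getattr(obj, method_name)

    def wrapped(*args, **kwargs):
        kwargs["weights_only"] = False  # force, even if set by the caller
        return original(*args, **kwargs)

    setattr(obj, method_name, wrapped)
    try:
        yield
    finally:
        setattr(obj, method_name, original)  # restore on exit

# Illustrative usage; the real attribute path depends on the lightning version.
# with force_weights_only_false(trainer._checkpoint_connector, "_load_checkpoint"):
#     tuner.lr_find(model, train_dataloaders=train_loader)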

@fkiraly
Collaborator

fkiraly left a comment

Interesting find that this works!

Though I do not agree with the pattern - overriding the torch installation is imo a bad idea, as it can have unpredictable side effects: it would override torch.load for any and every call.

Is there a way to avoid doing that and instead inject the override at call level?
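
As an aside, recent torch versions (2.5+) also support a call-level fix that keeps the safety check enabled: allowlisting the specific classes the checkpoint contains. A minimal sketch, where MyEncoder is a hypothetical stand-in for whatever class the UnpicklingError names; an alternative direction, not what this PR does:

import torch
import torch.serialization

from mypackage import MyEncoder  # hypothetical class named in the error

# Inside this block, weights_only=True loading accepts MyEncoder while
# still rejecting any other arbitrary pickled object.
with torch.serialization.safe_globals([MyEncoder]):
    ckpt = torch.load("epoch=0-step=10.ckpt")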

@phoeenniixx phoeenniixx requested a review from fkiraly December 9, 2025 12:57

Development

Successfully merging this pull request may close these issues.

[BUG] Unpickling Error in weight loading

2 participants