Codecov Report❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #1170 +/- ##
==========================================
+ Coverage 87.30% 87.35% +0.04%
==========================================
Files 57 57
Lines 7706 7719 +13
Branches 7706 7719 +13
==========================================
+ Hits 6728 6743 +15
+ Misses 673 671 -2
Partials 305 305
Pull request overview
This PR changes the default behavior of the model “ironing out” loop by reducing max_ironing_out_iterations from 10 to 1, and updates expected regression outputs accordingly. It also extends the patched-example mechanism so a patched example can optionally include a model.toml patch (intended to preserve coverage of the loop-enabled configuration via a patched “simple” model).
Changes:
- Set the default max_ironing_out_iterations to 1 (and update the input schema default accordingly).
- Extend the patched-example registry to include an optional TOML patch, adding a simple_ironing_out patched example that sets iterations back to 10.
- Refresh a large set of regression “golden” CSV outputs to match the new default behavior; add new golden outputs for simple_ironing_out.
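As a rough illustration of what an optional model.toml patch amounts to, the sketch below overrides a single top-level key in TOML text using only the standard library. It is not the PR's implementation: the helper name `set_top_level_key` and the `some_other_setting` key are hypothetical, and a real implementation would more likely parse and merge the TOML properly.

```rust
/// Set a top-level `key = value` pair in TOML text, replacing an existing
/// entry or adding one at the top if the key is absent.
///
/// Hypothetical helper, deliberately naive: it only recognises simple
/// one-line keys that appear before the first `[table]` header.
fn set_top_level_key(toml: &str, key: &str, value: &str) -> String {
    let mut out: Vec<String> = Vec::new();
    let mut replaced = false;
    let mut in_table = false;

    for line in toml.lines() {
        let trimmed = line.trim_start();
        // Once a table header is seen, later keys are no longer top-level.
        if trimmed.starts_with('[') {
            in_table = true;
        }
        let matches_key = !in_table
            && trimmed
                .split_once('=')
                .map_or(false, |(lhs, _)| lhs.trim() == key);
        if matches_key {
            out.push(format!("{key} = {value}"));
            replaced = true;
        } else {
            out.push(line.to_string());
        }
    }

    // Insert at the very top so the new key stays outside any table.
    if !replaced {
        out.insert(0, format!("{key} = {value}"));
    }
    out.join("\n")
}
```

Applied to a base model.toml containing max_ironing_out_iterations = 1, calling this with the value "10" would restore the previous default for one patched example while leaving every other setting untouched.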
Reviewed changes
Copilot reviewed 25 out of 26 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| tests/data/two_regions/commodity_prices.csv | Updated expected commodity price outputs under the new default iteration count. |
| tests/data/two_regions/assets.csv | Updated expected asset outputs under the new default iteration count. |
| tests/data/two_outputs/commodity_prices.csv | Updated expected commodity price outputs under the new default iteration count. |
| tests/data/two_outputs/assets.csv | Updated expected asset outputs under the new default iteration count. |
| tests/data/simple_npv/assets.csv | Updated expected asset outputs for the NPV patched scenario under the new default. |
| tests/data/simple_ironing_out/commodity_prices.csv | New golden output for the patched “simple” model with ironing-out enabled (iterations=10). |
| tests/data/simple_ironing_out/commodity_flows.csv | New golden output for the patched “simple” model with ironing-out enabled (iterations=10). |
| tests/data/simple_ironing_out/assets.csv | New golden output for the patched “simple” model with ironing-out enabled (iterations=10). |
| tests/data/simple/debug_solver.csv | Updated debug solver expectations reflecting fewer iterations by default. |
| tests/data/simple/debug_dispatch_assets.csv | Updated debug dispatch expectations reflecting fewer iterations by default. |
| tests/data/simple/debug_commodity_balance_duals.csv | Updated debug dual expectations reflecting fewer iterations by default. |
| tests/data/simple/debug_appraisal_results.csv | Updated appraisal debug expectations reflecting fewer iterations by default. |
| tests/data/muse1_default/commodity_prices.csv | Updated expected commodity price outputs under the new default iteration count. |
| tests/data/muse1_default/commodity_flows.csv | Updated expected commodity flow outputs under the new default iteration count. |
| tests/data/muse1_default/assets.csv | Updated expected asset outputs under the new default iteration count. |
| tests/data/circularity/commodity_prices.csv | Updated expected commodity price outputs under the new default iteration count. |
| tests/data/circularity/assets.csv | Updated expected asset outputs under the new default iteration count. |
| src/model/parameters.rs | Changes the default max_ironing_out_iterations from 10 to 1. |
| src/example/patches.rs | Extends patch registry to include optional TOML patch; adds simple_ironing_out patch entry. |
| src/cli/example.rs | Updates CLI patched-example extraction to apply file patches + optional TOML patch. |
| schemas/input/model.yaml | Updates schema default for max_ironing_out_iterations to 1. |
 fn extract_example(name: &str, patch: bool, dest: &Path) -> Result<()> {
     if patch {
-        let patches = get_patches(name)?;
+        let (file_patches, toml_patch) = get_patches(name)?;
get_patches(name)? returns a reference to a (Vec<FilePatch>, Option<String>), but this code destructures it as if it were an owned tuple. This won’t compile (type mismatch &(…) vs (…)). Either bind the returned reference and access its fields, or change get_patches to return an owned tuple (e.g., cloned) so destructuring works.
Suggested change:
-        let (file_patches, toml_patch) = get_patches(name)?;
+        let (file_patches, toml_patch) = get_patches(name)?.clone();
Well it does compile...
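The reason it compiles is Rust's default binding modes ("match ergonomics"): a tuple pattern matched against a reference to a tuple binds each field by reference, so no clone is needed. A minimal sketch follows, with stand-ins that are assumptions rather than the project's code: `FilePatch` is a placeholder unit struct, `describe` is a hypothetical caller, and `get_patches` takes an explicit registry argument here only to keep the example self-contained.

```rust
use std::collections::HashMap;

// Hypothetical placeholder for the real FilePatch type.
#[allow(dead_code)]
#[derive(Debug)]
struct FilePatch;

// File patches plus an optional TOML patch, as described in the review.
type Patches = (Vec<FilePatch>, Option<String>);

// Returns a *reference* to the registered tuple (registry argument is an assumption).
fn get_patches<'a>(
    registry: &'a HashMap<&'static str, Patches>,
    name: &str,
) -> Result<&'a Patches, String> {
    registry
        .get(name)
        .ok_or_else(|| format!("no patches registered for {name}"))
}

// Hypothetical caller that destructures the borrowed tuple directly.
fn describe(
    registry: &HashMap<&'static str, Patches>,
    name: &str,
) -> Result<(usize, bool), String> {
    // Matching `&(A, B)` against `(a, b)` binds by reference:
    // file_patches: &Vec<FilePatch>, toml_patch: &Option<String>.
    let (file_patches, toml_patch) = get_patches(registry, name)?;
    Ok((file_patches.len(), toml_patch.is_some()))
}
```

The suggested `.clone()` would also compile if the tuple's fields implement Clone, but it copies data that only needs to be read.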
@alexdewar Now getting slightly different results on macOS, which is concerning. I seem to remember something like this has happened before...?
These differences are pretty large too. Not sure what's going on here. If you want to compare the macOS results with e.g. the Windows ones, we upload them as test artifacts, which you can see on the summary page: https://github.com/EnergySystemsModellingLab/MUSE2/actions/runs/22673536177?pr=1170
The difference concerns electricity supply in the night timeslice in 2045 by two assets (8 and 25):
These are both gasCCGT assets in R2, and although the commission year is different, the parameters for these assets are identical. Therefore, as far as the objective is concerned, the two solutions are equally good.

It's probably the case for a lot of optimisations that there's no single optimal solution, but rather a space of equally good solutions. Even so, 99.9% of the time HiGHS seems to pick the same solution every time, regardless of OS. (I think this always ends up being a "corner" solution, i.e. one where one or more parameters in the space is at its limit. The two solutions above are different corner solutions where supply by asset 25 is at its upper and lower limit respectively.) I don't know the factors that cause it to choose one optimum over another, but it seems that tiny floating-point differences between platforms can have an impact (perhaps tilting the energy landscape slightly so that one solution becomes ever-so-slightly favourable over another, or affecting pivot decisions in the optimisation path so it lands on a different vertex).

The best solution I can think of is to modify the objective so that we can guarantee a single optimum solution. One idea would be L2 regularisation: adding a small epsilon-scaled term to the objective to minimise the sum of squares of the variables. Generally this favours smaller values over large "extreme" values, which for our purposes means spreading activity over multiple assets rather than allocating all activity to a single asset, which is probably not a bad thing to aim for in any case. (In this particular example, the Mac solution would be the single unique optimum.) In practice I'm not sure how easy this would be. Apart from that I don't really have any other ideas...
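To make the degeneracy argument concrete, here is a toy sketch (not MUSE2 code and not a call into HiGHS): two identical assets share a fixed demand, and a coarse grid search compares the plain linear cost with an L2-regularised one. All function names, the demand, the capacity limit, the unit cost and the epsilon are made-up values for illustration.

```rust
/// Linear cost: both assets have the same unit cost, so only the total matters.
fn linear_cost(x1: f64, x2: f64, unit_cost: f64) -> f64 {
    unit_cost * (x1 + x2)
}

/// Linear cost plus a small epsilon-scaled sum-of-squares penalty.
fn regularised_cost(x1: f64, x2: f64, unit_cost: f64, eps: f64) -> f64 {
    linear_cost(x1, x2, unit_cost) + eps * (x1 * x1 + x2 * x2)
}

/// Grid-search the feasible splits of `demand` between two assets, each
/// limited to `upper`, and return every value of x1 whose objective is
/// within a tight tolerance of the best one found.
fn minimisers(
    demand: f64,
    upper: f64,
    steps: usize,
    objective: impl Fn(f64, f64) -> f64,
) -> Vec<f64> {
    // x2 = demand - x1 must also respect the upper limit.
    let lo = (demand - upper).max(0.0);
    let hi = upper.min(demand);
    let candidates: Vec<f64> = (0..=steps)
        .map(|i| lo + (hi - lo) * i as f64 / steps as f64)
        .collect();
    let best = candidates
        .iter()
        .map(|&x1| objective(x1, demand - x1))
        .fold(f64::INFINITY, f64::min);
    candidates
        .into_iter()
        .filter(|&x1| objective(x1, demand - x1) - best <= 1e-12)
        .collect()
}
```

With a demand of 10 and a limit of 8 per asset, every split on the grid ties under `linear_cost`, including both corner solutions, whereas `regularised_cost` leaves a single minimiser at the even split. The grid search is only there to keep the sketch dependency-free; in a real model the quadratic term would turn the LP into a QP, which is part of why this may not be easy in practice.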
Annoying that this PR is getting blocked by something completely unrelated! For the time being, do you think it would be ok to turn off this particular regression test until we can get this fixed? What else can we do...?
Or revert this particular example model back to using the ironing out loop and update it later? |
This sounds plausible.
This is an interesting idea, but probably quite a lot of work and, as you say, out of scope for this PR anyway. As an aside, you can actually call the functions in the lower-level
How about we just skip this test on non-x86 platforms for now? This should work:
diff --git a/tests/regression.rs b/tests/regression.rs
index 41c933b9..4307bc3e 100644
--- a/tests/regression.rs
+++ b/tests/regression.rs
@@ -26,6 +26,9 @@ define_regression_test!(circularity);
// Patched examples
define_regression_test_with_patches!(simple_divisible);
define_regression_test_with_patches!(simple_npv);
+
+// We get different results on ARM Macs, for reasons that aren't clear
+#[cfg(target_arch = "x86_64")]
define_regression_test_with_patches!(simple_ironing_out);
// ------ END: regression tests ------
It would be better if we opened an issue and linked to it in the comment. It would be nice if the examples gave similar results on all the platforms we care about, but it probably shouldn't be a high priority to fix this. But let's keep track of it anyway!
I'm going to hold this until #1173. That will be another major change to all of the example models, and there's every chance that that will magically fix the issue here. (Or, it could just as likely introduce more issues just like this...) In any case, this does reveal an underlying issue, so I've opened #1174 |
Description
Turns off the ironing out loop by default (max_ironing_out_iterations = 1), as requested by @ahawkes, but maintains a regression test for a patched version of the simple model with the loop turned on (iterations = 10, the previous default). To do this I had to modify the patching API to allow patching the example model's model.toml (this will also be useful for #1105).
Turns out for the "simple" example this doesn't have any impact, but for others it does.
Fixes #1165