delete_file_on_restart: false wrongly stops scans from resuming with the hdf5_v1 printer#485
delete_file_on_restart: false wrongly stops scans from resuming with the hdf5_v1 printer#485
delete_file_on_restart: false wrongly stops scans from resuming with the hdf5_v1 printer#485Conversation
…emp' to '_combined'. This way a user don't risk deleting both the combined file and all the temporary files if they do e.g. 'rm *.hdf5_temp_*' after running the combine script.
|
I've tested this and your fix works for me. I would be happy to merge for this issue. I did however, find another potential issue: If we allow the combine routines (which admittedly we rarely do for large scans), the combined file does not seem to be named what is expected when it looks for a file. It expected "_combined" on the end. This will mean that running the case below will end up restarting the scan and overwriting old data. This should be a simple fix to either make sure the combine script adds the _combined part, or change what is looked for in the hdf5 printer. Test case (spartan.yaml):
This will give a message about assuming it is a new scan, and the data will be different for the first generation. |
| // Name of file where combined results from previous (unfinished) runs end up | ||
| std::ostringstream rename; | ||
| rename << finalfile << "_temp_combined"; | ||
| rename << finalfile << "_combined"; |
There was a problem hiding this comment.
This is probably the relevant part for my comment about restarting overwriting data.
| "\n ascii file, on separate lines." | ||
| "\n In this mode <filename> will only be used for naming the output" | ||
| "\n file (Note: WITHOUT the usual addition of '_temp_combined'!!!)." | ||
| "\n file (Note: WITHOUT the usual addition of '_combined'!!!)." |
There was a problem hiding this comment.
Does this mean that my naive restart test that doesn't include the _combined name is using the combine script incorrectly?
There was a problem hiding this comment.
Is this actually the combine code that GAMBIT runs? Looks to me like this is just a standalone program -- and one we don't really use, since we typically use the Python script?
There was a problem hiding this comment.
From the Core Meeting: This file never gets run, and can probably be deleted/moved to code storage.
|
Note from Core meeting: I will get back into looking at this, starting with merging the latest master into this PR branch. |
Due to a double if-check of the form
in
hdf5printer.cpp, gambit would refuse to resume a job whendelete_file_on_restart: false, even though this setting should only matter when gambit is run with the restart (-r) flag. This PR changes that if-else check to a nested one, similar to how it works for the hdf5 v2 printer.