Improves restart and TS guess handling #823
base: main
Ensures that the output dictionary is internally consistent after a restart, particularly with respect to convergence status and output paths. This commit introduces a `_missing_required_paths` method, which checks whether a species has all of its required output paths, and a `_sanitize_restart_output` method, which resets the convergence status when required paths are missing so that subsequent calculations do not see a misleading convergence flag. It also resets the output status when switching TS guesses or deleting all jobs of a species.
Addresses an issue where inconsistent transition state (TS) output during restart could lead to incorrect convergence status. This commit introduces a mechanism to sanitize the output and ensure accurate convergence determination upon restarting a project.
Avoids repeating methods in the TS guess report by ensuring uniqueness of successful and unsuccessful methods. This improves readability and prevents redundant information.
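As a minimal sketch of the deduplication described above: `dict.fromkeys()` keeps the first occurrence of each method name and preserves insertion order, which a plain `set()` would not guarantee. The helper name `join_unique` is hypothetical and the input list is illustrative, not actual ARC output.

```python
def join_unique(methods):
    """Join method names with commas, keeping only the first occurrence of each."""
    # Dict keys are unique and preserve insertion order (Python 3.7+).
    return ','.join(dict.fromkeys(methods))


# Illustrative input: the same TS guess method may be recorded several times.
successful_methods = ['autotst', 'autotst', 'gcn', 'gcn', 'kinbot', 'autotst']
report_line = join_unique(successful_methods)
print(report_line)  # autotst,gcn,kinbot
```

Using `','.join(...)` also avoids the trailing comma that appending `method + ','` in a loop would leave behind.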
Pull request overview
This PR improves restart and transition state (TS) guess handling in ARC by ensuring convergence is correctly assessed when restarting from inconsistent output states, deduplicating TS method names in reports, and preventing incorrect convergence status when required output paths are missing.
Changes:
- Adds output sanitization on restart to detect and fix inconsistent convergence/path states
- Deduplicates TS guess method names in the TS report to improve readability
- Enhances convergence checking to verify required output paths exist before marking species as converged
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| arc/scheduler.py | Adds _sanitize_restart_output() and _missing_required_paths() methods; updates switch_ts() and delete_all_species_jobs() to properly reset output state including convergence, job_types, and paths |
| arc/species/species.py | Updates make_ts_report() to deduplicate successful and unsuccessful TS method lists using dict.fromkeys() before joining |
| arc/scheduler_test.py | Adds unit tests for path validation, restart sanitization, output reset on job deletion, and TS switching |
| functional/restart_test.py | Adds functional test to verify restart sanitization correctly resets convergence when paths are missing |
arc/scheduler.py
Outdated
    for job_type, path_key in path_map.items():
        if job_type == 'composite':
            required = self.composite_method is not None
        else:
            required = self.job_types.get(job_type, False)
        if not required:
            continue
        if self.species_dict[label].number_of_atoms == 1 and job_type in ['opt', 'freq']:
            continue
        if not self.output[label]['paths'].get(path_key, ''):
            self.output[label]['job_types'][job_type] = False
Copilot
AI
Jan 26, 2026
The _sanitize_restart_output method duplicates logic from _missing_required_paths. Lines 3145-3155 repeat the same path validation logic that's already in _missing_required_paths (lines 3117-3128). Consider refactoring to reuse the existing helper method or extracting the path validation into a shared utility function to improve maintainability and reduce duplication.
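One way to act on this comment is to let the sanitizer call the path check instead of repeating it. The sketch below is an assumption-laden illustration, not ARC's implementation: it uses standalone functions over plain dictionaries rather than `Scheduler` methods, and the contents of `PATH_MAP`, the function signatures, and the example paths are all hypothetical.

```python
# Hypothetical mapping from job type to its key in output[label]['paths'].
PATH_MAP = {'opt': 'geo', 'freq': 'freq', 'sp': 'sp', 'composite': 'composite'}


def missing_required_paths(output_entry, job_types, composite_method=None, number_of_atoms=2):
    """Return the required job types whose output path is empty or absent."""
    missing = []
    paths = output_entry.get('paths', {})
    for job_type, path_key in PATH_MAP.items():
        if job_type == 'composite':
            required = composite_method is not None
        else:
            required = job_types.get(job_type, False)
        if not required:
            continue
        # A monoatomic species needs no optimization or frequency job.
        if number_of_atoms == 1 and job_type in ('opt', 'freq'):
            continue
        if not paths.get(path_key, ''):
            missing.append(job_type)
    return missing


def sanitize_restart_output(output_entry, job_types, **kwargs):
    """Reset convergence and the affected job flags if a required path is missing."""
    missing = missing_required_paths(output_entry, job_types, **kwargs)
    if missing:
        output_entry['convergence'] = False
        flags = output_entry.setdefault('job_types', {})
        for job_type in missing:
            flags[job_type] = False
    return missing


# Illustrative restart entry: marked converged, but the freq path is empty.
entry = {'convergence': True,
         'job_types': {'opt': True, 'freq': True, 'sp': True},
         'paths': {'geo': '/tmp/opt.log', 'freq': '', 'sp': '/tmp/sp.log'}}
print(sanitize_restart_output(entry, {'opt': True, 'freq': True, 'sp': True}))  # ['freq']
print(entry['convergence'])  # False
```

Returning the list of missing job types, rather than a boolean, lets the same helper serve both the convergence check and the sanitizer.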
    if self.species_dict[label].number_of_atoms == 1 and job_type in ['opt', 'freq']:
        continue
    if not self.output[label]['paths'].get(path_key, ''):
        self.output[label]['job_types'][job_type] = False
Copilot
AI
Jan 26, 2026
In the _sanitize_restart_output method, there's no check for the existence of 'job_types' key before accessing it at line 3155. While _missing_required_paths at line 3126 uses .get() with a default value to safely access job_types, this line directly accesses self.output[label]['job_types'][job_type]. If 'job_types' doesn't exist in self.output[label], this will raise a KeyError. Consider using safe dictionary access or add a check for the existence of 'job_types' key.
    - self.output[label]['job_types'][job_type] = False
    + if 'job_types' in self.output[label]:
    +     self.output[label]['job_types'][job_type] = False
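The suggested guard skips the write when `job_types` is absent, so the failed job is never recorded. An alternative sketch, assuming it is acceptable to create the missing key, uses `dict.setdefault()` so that the flag is always stored. The `entry` dictionary and the `mark_job_failed` name are hypothetical stand-ins for `self.output[label]` and are not ARC code.

```python
def mark_job_failed(entry, job_type):
    """Record job_type as not done, creating 'job_types' if it is missing."""
    entry.setdefault('job_types', {})[job_type] = False
    return entry


# An entry restored from a restart file that lacks the 'job_types' key.
entry = {'convergence': True, 'paths': {'freq': ''}}
mark_job_failed(entry, 'freq')
print(entry['job_types'])  # {'freq': False}
```

Which variant is preferable depends on whether downstream code expects `job_types` to always exist after sanitization.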
    +     unique_successful_methods = list(dict.fromkeys(self.successful_methods))
    +     self.ts_report += ','.join(unique_successful_methods)
      if self.unsuccessful_methods:
          self.ts_report += '\nMethods that were unsuccessfully in generating a TS guess:\n'
Copilot
AI
Jan 26, 2026
Grammatical error: "were unsuccessfully in generating" should be "were unsuccessful in generating" (remove the 'ly').
    - self.ts_report += '\nMethods that were unsuccessfully in generating a TS guess:\n'
    + self.ts_report += '\nMethods that were unsuccessful in generating a TS guess:\n'
arc/scheduler.py
Outdated
    if label in self.output:
        self.output[label]['convergence'] = False
        for key in ['opt', 'freq', 'sp', 'composite', 'fine']:
            if key in self.output[label]['job_types']:
                self.output[label]['job_types'][key] = False
        if 'paths' in self.output[label]:
            for key in self.output[label]['paths']:
                self.output[label]['paths'][key] = '' if key != 'irc' else list()
Copilot
AI
Jan 26, 2026
There is duplicated logic for resetting output between switch_ts (lines 2624-2631) and delete_all_species_jobs (lines 3620-3626). Both methods reset convergence, job_types, and paths in nearly identical ways. Since switch_ts calls delete_all_species_jobs at line 2623, consider removing the duplicate resetting logic from switch_ts and relying solely on delete_all_species_jobs to handle the output reset. This would improve maintainability and reduce the risk of the two implementations diverging.
    - if label in self.output:
    -     self.output[label]['convergence'] = False
    -     for key in ['opt', 'freq', 'sp', 'composite', 'fine']:
    -         if key in self.output[label]['job_types']:
    -             self.output[label]['job_types'][key] = False
    -     if 'paths' in self.output[label]:
    -         for key in self.output[label]['paths']:
    -             self.output[label]['paths'][key] = '' if key != 'irc' else list()
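The consolidation suggested in this comment can be sketched as a single reset helper that both call sites delegate to. `reset_species_output` and `RESETTABLE_JOB_TYPES` are hypothetical names, the species label and file names are illustrative, and a plain dictionary stands in for the scheduler's `self.output`.

```python
RESETTABLE_JOB_TYPES = ('opt', 'freq', 'sp', 'composite', 'fine')


def reset_species_output(output, label):
    """Clear convergence, job flags, and paths for one species, in place."""
    entry = output.get(label)
    if entry is None:
        return
    entry['convergence'] = False
    job_flags = entry.get('job_types', {})
    for key in RESETTABLE_JOB_TYPES:
        if key in job_flags:
            job_flags[key] = False
    for key in entry.get('paths', {}):
        # 'irc' holds a list of paths; every other entry is a single string.
        entry['paths'][key] = [] if key == 'irc' else ''


output = {'TS0': {'convergence': True,
                  'job_types': {'opt': True, 'freq': True, 'rotors': True},
                  'paths': {'geo': 'opt.log', 'freq': 'freq.log', 'irc': ['f.log', 'r.log']}}}
reset_species_output(output, 'TS0')
print(output['TS0'])
```

With one helper, a later change to the reset rules (for example, a new job type) only has to be made in one place.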
    +     unique_successful_methods = list(dict.fromkeys(self.successful_methods))
    +     self.ts_report += ','.join(unique_successful_methods)
      if self.unsuccessful_methods:
          self.ts_report += '\nMethods that were unsuccessfully in generating a TS guess:\n'
    -     for unsuccessful_method in self.unsuccessful_methods:
    -         self.ts_report += unsuccessful_method + ','
    +     unique_unsuccessful_methods = list(dict.fromkeys(self.unsuccessful_methods))
    +     self.ts_report += ','.join(unique_unsuccessful_methods)
Copilot
AI
Jan 26, 2026
The deduplication of TS methods in the report will cause an existing test to fail. The test in arc/species/species_test.py at lines 1202-1206 expects the ts_report to contain duplicates (e.g., 'autotst,autotst,autotst,autotst,gcn,gcn,...'), but after this change the report will show 'autotst,gcn,kinbot' instead. The test should be updated to match the new expected behavior.
Ensures restart output state is internally consistent, particularly addressing cases where convergence is indicated without corresponding output paths for TS calculations. This prevents issues arising from inconsistent data by resetting the convergence status and job types when required output paths are missing. The test suite adds a set to track created projects and adds a run_opt_job mock in the "sanitize TS output" test.
Removes the mock for `delete_all_species_jobs` within the `switch_ts` test. This mock is no longer required, simplifying the test setup.
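For illustration only, the sketch below shows the general shape of a test that mocks `run_opt_job`, using `unittest` and `unittest.mock` from the standard library. `FakeScheduler`, its `restart` method, and the rule it applies are hypothetical stand-ins rather than ARC's `Scheduler`; the point is that patching `run_opt_job` keeps the test from launching a real job while still letting it assert on the sanitized state.

```python
import unittest
from unittest import mock


class FakeScheduler:
    """Hypothetical stand-in for a scheduler; not ARC's Scheduler class."""

    def __init__(self, output):
        self.output = output

    def run_opt_job(self, label):
        raise RuntimeError('would submit a real job')

    def restart(self, label):
        # Assumed rule: a converged flag without a geometry path is inconsistent.
        entry = self.output[label]
        if entry['convergence'] and not entry['paths'].get('geo', ''):
            entry['convergence'] = False
            self.run_opt_job(label)


class TestSanitizeTSOutput(unittest.TestCase):
    def test_convergence_reset_when_path_missing(self):
        sched = FakeScheduler({'TS0': {'convergence': True, 'paths': {'geo': ''}}})
        # Replace run_opt_job on the class so no job is actually started.
        with mock.patch.object(FakeScheduler, 'run_opt_job') as run_opt:
            sched.restart('TS0')
        self.assertFalse(sched.output['TS0']['convergence'])
        run_opt.assert_called_once_with('TS0')
```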
Updates the scheduler to handle inconsistent transition state output upon restart, ensuring convergence is correctly assessed.
Also, deduplicates transition state guess methods in the report and addresses potential issues arising from missing output paths after a restart.