Add control commands to track jobs in CLI #72
Vismayak
commented
Mar 2, 2026
- Updated README.md to include new job control commands for listing, checking status, viewing logs, and canceling jobs.
- Enhanced CLI functionality to track jobs in a local registry, integrating with Slurm for real-time job state updates.
- Introduced job name generation for SLURM jobs to improve identification and tracking.
- Updated batch script creation to use dynamic job names based on model identifiers.
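
As a rough illustration of the job name generation described above, here is a minimal sketch that derives a scheduler-friendly name from a model identifier. The function name `make_job_name`, the `llmflux` prefix, and the 40-character cap are assumptions made for this example and are not taken from the PR's code.

```python
import re

MAX_NAME_LEN = 40  # arbitrary cap chosen for this sketch, not a Slurm limit


def make_job_name(model: str, prefix: str = "llmflux") -> str:
    """Build a job name from a model identifier (hypothetical helper)."""
    # Collapse every run of characters outside [A-Za-z0-9_.-] into one hyphen.
    slug = re.sub(r"[^A-Za-z0-9_.-]+", "-", model).strip("-").lower()
    name = f"{prefix}-{slug}" if slug else prefix
    # Truncate, then drop any hyphen left dangling at the cut point.
    return name[:MAX_NAME_LEN].rstrip("-")


if __name__ == "__main__":
    print(make_job_name("llama3.2:3b"))
```

A generated batch script could then pass the result to the `#SBATCH --job-name=` directive.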
I'm having trouble running more extensive tests because of the large queue on CampusCluster, but initial tests are promising. I think too much data is shown in LLMFlux status, so I will need to remove some of the unnecessary data. We also have the job status
Note: now that we are able to get a more accurate elapsed time using SLURM job details, we should remove the inefficient Will create a separate issue for this. We could possibly run the benchmark task as a background process that polls the job status and writes the benchmark file with the accurate elapsed time once the job completes.
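
The polling idea could look something like the sketch below. It is only an illustration under stated assumptions: `wait_for_job` and its parameters are hypothetical names, the state lookup is injected as a callable so the loop does not depend on any particular Slurm query, and the set of terminal states is a subset of Slurm's job state codes chosen for this example.

```python
import time
from typing import Callable, Optional

# Slurm job states after which a job will not run again (subset, for this sketch).
TERMINAL_STATES = {
    "COMPLETED", "FAILED", "CANCELLED", "TIMEOUT", "NODE_FAIL", "OUT_OF_MEMORY",
}


def wait_for_job(
    job_id: str,
    get_state: Callable[[str], str],
    interval: float = 30.0,
    max_polls: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Poll get_state(job_id) until a terminal state is seen.

    Returns the terminal state, or None if max_polls is reached first.
    """
    polls = 0
    while True:
        state = get_state(job_id)
        if state in TERMINAL_STATES:
            return state
        polls += 1
        if max_polls is not None and polls >= max_polls:
            return None
        sleep(interval)
```

Once a terminal state is returned, the caller could read the elapsed time from the job's accounting record and write the benchmark file.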
- Updated CLI to use new job detail retrieval methods, replacing deprecated functions for active and historical jobs.
- Enhanced job state extraction logic to accommodate different Slurm JSON schemas.
- Added helper functions for formatting timestamps and job time limits.
- Improved test coverage for job commands and state extraction.
- Updated documentation strings for clarity and consistency.
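
To illustrate what schema-tolerant state extraction can involve, here is a minimal sketch. The record shapes it handles (a `job_state` or `state` key whose value is a string, a list of strings, or a dict with a `current` field) are assumptions about how Slurm JSON output may vary between versions, and `extract_job_state` is a hypothetical name rather than the function added in this PR.

```python
from typing import Any


def extract_job_state(job: dict[str, Any]) -> str:
    """Return one upper-case state string, or "UNKNOWN" if none is found."""
    # Assumed keys: "job_state" (queue-style) or "state" (accounting-style).
    raw = job.get("job_state", job.get("state"))
    # Accounting-style records may nest the value under "current".
    if isinstance(raw, dict):
        raw = raw.get("current")
    # Some schemas wrap the state in a list; take the first entry.
    if isinstance(raw, (list, tuple)):
        raw = raw[0] if raw else None
    if not isinstance(raw, str) or not raw.strip():
        return "UNKNOWN"
    # Drop trailing detail such as "CANCELLED by 1000".
    return raw.strip().split()[0].upper()
```

Returning a sentinel instead of raising keeps a status listing usable even when one record has an unexpected shape.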
- Changed the CLI to use job details from the `jobs` dictionary instead of `slurm_data`.
- Modified `get_active_job_details` to utilize `sacct` for fetching job states, improving accuracy for running and pending jobs.
- Simplified the `get_job_details` function to directly query accounting history, removing the fallback to the queue.
- Enhanced unit tests for job details retrieval, ensuring only relevant job IDs are returned.
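
For readers unfamiliar with `sacct`, the sketch below shows one way to fetch states for a known set of job IDs and keep only the requested ones. It is not the PR's implementation: `parse_sacct_states` and `query_job_states` are hypothetical names, and the sketch assumes `sacct` is on the PATH and that `--parsable2` output with `--format=JobID,State` yields one `JobID|State` line per job.

```python
import subprocess


def parse_sacct_states(output: str, wanted: set[str]) -> dict[str, str]:
    """Parse "JobID|State" lines, keeping only the job IDs in `wanted`."""
    states: dict[str, str] = {}
    for line in output.splitlines():
        job_id, sep, state = line.strip().partition("|")
        if not sep or job_id not in wanted:
            continue  # skip malformed lines and unrelated jobs
        # "CANCELLED by 1000" is reduced to "CANCELLED".
        states[job_id] = state.split()[0] if state.strip() else "UNKNOWN"
    return states


def query_job_states(job_ids: list[str]) -> dict[str, str]:
    """Ask sacct for the state of each job ID (requires a Slurm cluster)."""
    if not job_ids:
        return {}
    cmd = [
        "sacct", "-X", "--noheader", "--parsable2",
        "--format=JobID,State", "-j", ",".join(job_ids),
    ]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return parse_sacct_states(result.stdout, set(job_ids))
```

Separating the parser from the subprocess call lets the filtering logic be unit tested without access to a cluster.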
Interesting idea. How do I use this? Example command?
Hi @joshfactorial, you can run it by following the instructions in the README 😄
Ralhazmy1
left a comment
Tested the jobs commands; they worked as expected.
Everything worked for me!