-
Notifications
You must be signed in to change notification settings - Fork 1
Improve efficiency of selection of training files for classification. #138
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR aims to improve the efficiency of training file selection and XGBoost model training for angular reconstruction. The changes remove the non-binned XGB training workflow in favor of the binned approach, eliminate separate per-telescope-multiplicity models, and optimize the file selection process with better parsing and filtering.
Changes:
- Removed the unbinned XGB angular reconstruction training workflow (IRF.trainXGBforAngularReconstruction.sh deleted, related code removed from IRF.production.sh)
- Eliminated telescope multiplicity loop in favor of single unified model training, with increased training samples (5M vs 1M events) and configurable core usage (48 cores for training, 1 for inference)
- Optimized file selection script with array-based file handling, single-pass RUNINFO parsing, fail-fast filtering, pre-created directories, and verbose mode option
Reviewed changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| scripts/helper_scripts/submit_scripts_to_htcondor.sh | Commented out request_cpus parameter (no functional change) |
| scripts/helper_scripts/IRF.trainXGBforAngularReconstruction_sub.sh | Removed telescope multiplicity parameter, increased max_events to 5M, added MAXCORES=48 |
| scripts/helper_scripts/IRF.dispXGB_sub.sh | Updated comment for consistency, added MAXCORES=1 for model inference |
| scripts/helper_scripts/ANALYSIS.dispXGB_sub.sh | Updated comment to match IRF script, added blank line for formatting |
| scripts/IRF.trainXGBforAngularReconstructionBinned.sh | Removed telescope multiplicity loop, now trains single unified model |
| scripts/IRF.trainXGBforAngularReconstruction.sh | Deleted entire file (unbinned training workflow removed) |
| scripts/IRF.selectRunsForGammaHadronSeparationTraining.sh | Major refactoring for efficiency with array handling, optimized parsing, fail-fast filtering, and progress tracking |
| scripts/IRF.production.sh | Removed call to deleted unbinned XGB training script |
| scripts/ANALYSIS.dispXGB.sh | Renamed NRUNS to NFILES, improved documentation, added input validation |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| $(dirname "$0")/IRF.trainTMVAforAngularReconstruction.sh \ | ||
| $VX $ATM $ZA "$FIXEDWOBBLE" "$FIXEDNSB" 0 \ | ||
| $SIMTYPE $ANATYPE $UUID | ||
| else | ||
| # Explicitly remove 0.0 bin | ||
| FIXEDWOBBLE="0.25 0.5 0.75 1.0 1.25 1.5 1.75 2.0" | ||
| FIXEDNSB="160 200 250 350" | ||
| $(dirname "$0")/IRF.trainXGBforAngularReconstruction.sh \ | ||
| $VX $ATM $ZA "$FIXEDWOBBLE" "$FIXEDNSB" 0 \ | ||
| $SIMTYPE $ANATYPE $UUID | ||
| fi |
Copilot
AI
Jan 25, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The removal of this else block means the TRAINXGBANGRES IRF type is no longer handled in this script. However, line 407 (outside the changed region) still includes 'TRAINXGBANGRES' in the condition check. Users who specify IRFTYPE=TRAINXGBANGRES will enter the conditional block but no training will occur since only TRAINMVANGRES is handled. The TRAINXGBANGRES check should be removed from line 407, or this else block should be replaced with a call to the binned version (IRF.trainXGBforAngularReconstructionBinned.sh) if TRAINXGBANGRES is meant to be an alias for TRAINXGBANGRESBINNED.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
No description provided.