Cleanup #13

Merged
mikeheyns merged 13 commits into main from cleanup on Mar 19, 2026

@will-fawcett-trillium

Code cleanup and hygiene improvements:

  • Remove ~2,100 lines of dead code: delete old/deprecated Streamlit app versions (app_old_1.py, app_old_2.py,
    supermag_web_deployment_old_.py, data_sources_old_.py) and unused generate_data.py
  • Replace hardcoded paths with argparse in SHEATH training and inference scripts (sheath_inference.py, sheath_train_best_model.py,
    sheath_train_embeddings.py, sheath_feature_vector*.py), making them runnable outside the original dev environment
  • Fix broken imports: consolidate duplicated utility functions into the sheath2024 package (new __init__.py) and update all call sites
  • Improve documentation: expand README with project overview, architecture diagram, structure guide, and acknowledgements; tighten
    docstrings across scripts
  • Add public quickstart notebook (public/sheath_inference_quickstart.ipynb) for running SHEATH inference end-to-end
  • Clear notebook outputs and apply formatting for lint compliance
  • Bump minimum Python version from 3.10 to 3.11
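
  The PR does not say which tool was used to clear notebook outputs. As a minimal sketch using only the standard library, and assuming the notebooks are in the nbformat v4 JSON layout, the step could look like the following. The function names `clear_outputs` and `clear_notebook_file` are hypothetical and do not come from this repository.

```python
import json
from pathlib import Path


def clear_outputs(notebook: dict) -> dict:
    """Reset outputs and execution counts of code cells in place; return the same dict."""
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return notebook


def clear_notebook_file(path: Path) -> None:
    """Rewrite one .ipynb file with its outputs cleared (hypothetical helper)."""
    notebook = json.loads(path.read_text(encoding="utf-8"))
    clear_outputs(notebook)
    path.write_text(
        json.dumps(notebook, indent=1, ensure_ascii=False) + "\n",
        encoding="utf-8",
    )
```

  Markdown cells are left untouched, so only execution artifacts are removed from the diff. `jupyter nbconvert --clear-output --inplace` does a similar job from the command line.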

  The training and inference scripts had data/scaler/checkpoint paths
  hardcoded to developer home directories on GCP VMs. These are now
  configurable via command-line arguments, with the original values
  retained as defaults. Also wraps both scripts in proper main()
  functions and removes commented-out dead code.
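
  The pattern described above can be sketched as follows. This is an illustration, not the code from the PR: the flag names (`--data-dir`, `--scaler-path`, `--checkpoint-path`) and the placeholder default paths are assumptions, since the description names neither the actual arguments nor the original paths.

```python
import argparse
from pathlib import Path

# Hypothetical placeholders standing in for the original developer paths,
# which the PR description does not list.
DEFAULT_DATA_DIR = Path("/path/to/original/data")
DEFAULT_SCALER_PATH = Path("/path/to/original/scaler.pkl")
DEFAULT_CHECKPOINT_PATH = Path("/path/to/original/checkpoint.pt")


def build_parser() -> argparse.ArgumentParser:
    """Expose each formerly hardcoded path as an optional argument."""
    parser = argparse.ArgumentParser(description="Inference script (sketch)")
    parser.add_argument("--data-dir", type=Path, default=DEFAULT_DATA_DIR,
                        help="Directory containing the input data")
    parser.add_argument("--scaler-path", type=Path, default=DEFAULT_SCALER_PATH,
                        help="Path to the fitted scaler file")
    parser.add_argument("--checkpoint-path", type=Path,
                        default=DEFAULT_CHECKPOINT_PATH,
                        help="Path to the model checkpoint")
    return parser


def main(argv=None) -> argparse.Namespace:
    """Parse arguments; a real script would load data and run the model here."""
    args = build_parser().parse_args(argv)
    print(f"data: {args.data_dir}")
    print(f"scaler: {args.scaler_path}")
    print(f"checkpoint: {args.checkpoint_path}")
    return args


# Example: override one path and keep the other defaults. In a script this
# call would normally sit behind `if __name__ == "__main__":` as `main()`.
args = main(["--data-dir", "./data"])
```

  Keeping the old values as defaults means existing invocations behave as before, while anyone outside the original environment can pass their own paths.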
@mikeheyns merged commit 77fcaef into main on Mar 19, 2026
2 checks passed
@will-fawcett-trillium deleted the cleanup branch on March 19, 2026 17:04
3 participants