@manveerxyz manveerxyz commented Dec 17, 2025

Note

Introduces hosted RL training to the Prime CLI with a new API client and full CLI workflow.

  • Adds prime_cli/api/rl.py with RLClient, RLModel, RLRun supporting model listing, run CRUD (create/list/stop/delete), and log retrieval
  • New prime rl command group: run (configurable with W&B/eval options), models, list, logs (with cleaned streaming), stop, delete, and init (generate TOML template)
  • Implements config utilities (utils/config.py) for TOML loading and CLI+TOML merging via BaseConfig; re-exported in utils/__init__.py
  • Registers rl in main.py and organizes commands into help panels (Account/Lab/Compute)
  • Minor UX tweaks: clearer help for eval run; rich link formatting in eval push output

Written by Cursor Bugbot for commit 84abaf0. This will update automatically on new commits.

@manveerxyz manveerxyz changed the title from "WIP: Hosted RL Entrypoint" to "Hosted RL Entrypoint" Dec 18, 2025
manveerxyz and others added 11 commits December 28, 2025 19:20
 Usage: prime rl [OPTIONS] ENVIRONMENTS... | COMMAND [ARGS]...

 Manage RL training runs.

 By default, 'prime rl <environments>' runs 'prime rl run <environments>'.

╭─ Options ─────────────────────────────────────────────────────────────────╮
│ --help  -h        Show this message and exit.                             │
╰───────────────────────────────────────────────────────────────────────────╯
╭─ Commands ────────────────────────────────────────────────────────────────╮
│ run      Create an RL training run with specified environments and model. │
│ models   List available models for RL training.                           │
│ runs     List your RL training runs.                                      │
│ stop     Stop an RL training run.                                         │
│ delete   Delete an RL training run.                                       │
╰───────────────────────────────────────────────────────────────────────────╯
to start a run
* quick fix for prime rl list when no name set

* remove truncation of id in prime rl list
* feat: add eval_config support to RL API client

* Remove accidentally committed test files

* feat: add logs command for RL runs

* fix: move time import to top, add rl_config example

* feat: add --watch flag and improve log streaming

* fix: allow built-in envs like reverse-text, update example

* feat: add --eval-* options to rl run command

* fix: strip ANSI escape codes from logs output

* fix: increase poll interval to 5s, add rate limit handling

* fix: filter progress bars from logs output, remove redundant --watch flag

* fix: keep 100% progress bar completion lines in logs

* fix: address review comments - simplify log follow, warn on unused eval options

* fix: handle log rotation in follow mode when tail window is full

* fix: always use overlap detection for log follow to handle fast growth with rotation

* feat: add [eval] section support in TOML config files

* fix: improve progress bar filtering to remove empty lines

* fix: require owner/name format for environments, remove example config

* fix: use from_sources for eval config merging, require owner/name format

- Use BaseConfig.from_sources for eval config precedence instead of manual if-statements
- Require owner/name format for --eval-envs (same as training environments)
- Rename EvalConfig.eval_base_model to base_model for proper underscore mapping
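
The log-handling fixes in the commits above (ANSI stripping, progress-bar filtering that keeps 100% lines, and overlap detection in follow mode) can be sketched roughly as follows. This is an illustration under assumptions, not the code in this PR: `clean_lines` and `new_lines` are hypothetical names, and the progress-bar pattern assumes tqdm-style lines such as ` 50%|#####     | 5/10`.

```python
import re

# CSI-style ANSI escape sequences (colours, cursor movement).
ANSI_RE = re.compile(r"\x1b\[[0-9;?]*[ -/]*[@-~]")
# Assumed tqdm-like progress format: a percentage immediately followed by a bar.
PROGRESS_RE = re.compile(r"(\d{1,3})%\|")


def clean_lines(raw: str) -> list[str]:
    """Strip ANSI codes, drop empty lines and unfinished progress bars."""
    cleaned = []
    for line in raw.splitlines():
        line = ANSI_RE.sub("", line).rstrip()
        if not line.strip():
            continue
        match = PROGRESS_RE.search(line)
        if match and match.group(1) != "100":
            continue  # keep only the 100% completion line
        cleaned.append(line)
    return cleaned


def new_lines(previous: list[str], current: list[str]) -> list[str]:
    """Return the lines of `current` that follow its longest overlap with the tail of `previous`."""
    for size in range(min(len(previous), len(current)), 0, -1):
        if previous[-size:] == current[:size]:
            return current[size:]
    return current  # no overlap found: treat the whole window as new


raw = "\x1b[32mstep 1\x1b[0m\n 50%|#####     | 5/10\n\n100%|##########| 10/10\n"
print(clean_lines(raw))
print(new_lines(["a", "b", "c", "d"], ["c", "d", "e", "f"]))
```

Matching the tail of the previous poll against the head of the current one works whether the window simply grew or has rotated, though runs of identical lines can make the overlap ambiguous.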

JannikSt commented Jan 3, 2026

@codex review

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f553271071

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you:

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

kcoopermiller and others added 5 commits January 3, 2026 13:15
* custom image registry for sandboxes

* prime images

* --image typo

* linux/amd64

* updated to not build locally

* full image path

* rm emojis

* remove inline

* image status

* full image path

* add cleanup

* adjust scope output

* bug bot stuff

* validate_output_format

* bug bot comment

* update prime images list

* limit platform

* bump timeout

* add closed beta info
* bump version to 0.5.8

* bump versions
* Update eval sample field.

* Update docs.
Co-authored-by: Cursor Agent <cursoragent@cursor.com>
Co-authored-by: sami <sami@primeintellect.ai>
@JannikSt JannikSt merged commit 75ea30f into main Jan 3, 2026
16 of 18 checks passed
@JannikSt JannikSt deleted the feature/rft branch January 3, 2026 12:32
7 participants