Hosted RL Entrypoint #256

manveerxyz · 2025-12-17T02:26:58Z

Note

Introduces hosted RL training to the Prime CLI with a new API client and full CLI workflow.

Adds prime_cli/api/rl.py with RLClient, RLModel, RLRun supporting model listing, run CRUD (create/list/stop/delete), and log retrieval
New prime rl command group: run (configurable with W&B/eval options), models, list, logs (with cleaned streaming), stop, delete, and init (generate TOML template)
Implements config utilities (utils/config.py) for TOML loading and CLI+TOML merging via BaseConfig; re-exported in utils/__init__.py
Registers rl in main.py and organizes commands into help panels (Account/Lab/Compute)
Minor UX tweaks: clearer help for eval run; rich link formatting in eval push output

Written by Cursor Bugbot for commit 84abaf0.

packages/prime/src/prime_cli/utils/config.py

packages/prime/src/prime_cli/api/rl.py

packages/prime/src/prime_cli/commands/rl.py

packages/prime/src/prime_cli/api/rl.py

Usage: prime rl [OPTIONS] ENVIRONMENTS... | COMMAND [ARGS]...

Manage RL training runs.

By default, 'prime rl <environments>' runs 'prime rl run <environments>'.

╭─ Options ────────────────────────────────────────────────────────────────╮
│ --help  -h  Show this message and exit.                                  │
╰──────────────────────────────────────────────────────────────────────────╯
╭─ Commands ───────────────────────────────────────────────────────────────╮
│ run     Create an RL training run with specified environments and model. │
│ models  List available models for RL training.                           │
│ runs    List your RL training runs.                                      │
│ stop    Stop an RL training run.                                         │
│ delete  Delete an RL training run.                                       │
╰──────────────────────────────────────────────────────────────────────────╯

to start a run
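
The help text says a bare 'prime rl <environments>' behaves like 'prime rl run <environments>'. One way such a default could work is to rewrite the argument list before dispatch, as sketched below. The PR's actual mechanism is not shown on this page, so `resolve_argv` is a hypothetical helper. Only the command names are taken from the help output above.

```python
# Subcommand names as listed in the help output above.
KNOWN_COMMANDS = {"run", "models", "runs", "stop", "delete"}


def resolve_argv(args: list[str]) -> list[str]:
    """Prepend 'run' unless the first argument is an option or a known subcommand.

    Hypothetical helper for illustration only.
    """
    if not args or args[0].startswith("-") or args[0] in KNOWN_COMMANDS:
        return args
    return ["run", *args]


print(resolve_argv(["owner/env"]))  # ['run', 'owner/env']
print(resolve_argv(["models"]))     # ['models']
```

A consequence of this approach is that an environment whose name matches a subcommand could not be passed without typing `run` explicitly.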

* feat: add eval_config support to RL API client
* Remove accidentally committed test files
* feat: add logs command for RL runs
* fix: move time import to top, add rl_config example
* feat: add --watch flag and improve log streaming
* fix: allow built-in envs like reverse-text, update example
* feat: add --eval-* options to rl run command
* fix: strip ANSI escape codes from logs output
* fix: increase poll interval to 5s, add rate limit handling
* fix: filter progress bars from logs output, remove redundant --watch flag
* fix: keep 100% progress bar completion lines in logs
* fix: address review comments - simplify log follow, warn on unused eval options
* fix: handle log rotation in follow mode when tail window is full
* fix: always use overlap detection for log follow to handle fast growth with rotation
* feat: add [eval] section support in TOML config files
* fix: improve progress bar filtering to remove empty lines
* fix: require owner/name format for environments, remove example config
* fix: use from_sources for eval config merging, require owner/name format
  - Use BaseConfig.from_sources for eval config precedence instead of manual if-statements
  - Require owner/name format for --eval-envs (same as training environments)
  - Rename EvalConfig.eval_base_model to base_model for proper underscore mapping
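
Several of the commits above concern cleaning and following log output: stripping ANSI escape codes, dropping progress bars while keeping the 100% completion line, removing empty lines, and using overlap detection to find new lines in follow mode. The sketch below shows one way these ideas could fit together. It is not the code from this PR. `clean_log_lines`, `new_lines`, and the progress-bar pattern (a tqdm-style `NN%|` prefix) are assumptions.

```python
import re

# Matches CSI escape sequences such as colour codes (e.g. "\x1b[32m").
ANSI_RE = re.compile(r"\x1b\[[0-9;?]*[ -/]*[@-~]")
# Assumed progress-bar shape: a percentage immediately followed by "|".
PROGRESS_RE = re.compile(r"(\d{1,3})%\|")


def clean_log_lines(lines: list[str]) -> list[str]:
    """Strip ANSI codes, drop empty lines and unfinished progress bars."""
    cleaned = []
    for line in lines:
        text = ANSI_RE.sub("", line).rstrip()
        if not text.strip():
            continue  # drop lines that are empty after cleaning
        match = PROGRESS_RE.search(text)
        if match and int(match.group(1)) < 100:
            continue  # drop in-progress bars, keep the 100% line
        cleaned.append(text)
    return cleaned


def new_lines(previous: list[str], current: list[str]) -> list[str]:
    """Return the lines of `current` that follow its longest overlap
    with the tail of `previous` (all of `current` if nothing overlaps)."""
    max_overlap = min(len(previous), len(current))
    for size in range(max_overlap, 0, -1):
        if previous[-size:] == current[:size]:
            return current[size:]
    return current


raw = ["\x1b[32mstep 1\x1b[0m", "", " 50%|#####     | 5/10", "100%|##########| 10/10"]
print(clean_log_lines(raw))  # ['step 1', '100%|##########| 10/10']
print(new_lines(["a", "b", "c"], ["b", "c", "d"]))  # ['d']
```

Comparing the tail of the previous window with the head of the current one avoids relying on line counts, which stop being meaningful once the log rotates. Repeated identical lines can still make the overlap ambiguous.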

JannikSt · 2026-01-03T12:00:53Z

@codex review

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f553271071

packages/prime/src/prime_cli/commands/rl.py

* custom image registry for sandboxes
* prime images
* --image typo
* linux/amd64
* updated to not build locally
* full image path
* rm emojis
* remove inline
* image status
* full image path
* add cleanup
* adjust scope output
* bug bot stuff
* validate_output_format
* bug bot comment
* update prime images list
* limit platform
* bump timeout
* add closed beta info

manveerxyz changed the title from "WIP: Hosted RL Entrypoint" to "Hosted RL Entrypoint" on Dec 18, 2025

cursor bot reviewed Dec 18, 2025

packages/prime/src/prime_cli/utils/config.py (resolved)

packages/prime/src/prime_cli/api/rl.py (outdated, resolved)

cursor bot reviewed Dec 23, 2025

packages/prime/src/prime_cli/commands/rl.py (resolved)

manveerxyz force-pushed the feature/rft branch from 9537912 to 3f3af86 on December 23, 2025 04:28

cursor bot reviewed Dec 23, 2025

packages/prime/src/prime_cli/api/rl.py (resolved)

manveerxyz and others added 11 commits December 28, 2025 19:20

Implement commands for hosted RL

d4229a6

Hosted RL

65b8ad4

Support tomls on prime rl cmd

89079df

Minor fix

7e0b4e1

Cleanup references to RFT

deeb088

Minor improvements

63b2182

Fix ruff

a3e1cd9

Match post rft run schema to new backend

1dc8a75

Refactor delete_run method to remove return value and simplify success handling in RLClient and related command.

2cdfe27

Fix/prime rl list (#267)

5ab66bd

* quick fix for prime rl list when no name set
* remove truncation of id in prime rl list

manveerxyz force-pushed the feature/rft branch from 05d205a to 5ab66bd on December 29, 2025 20:33

Add support for run_config

084b563

JannikSt mentioned this pull request Dec 30, 2025

feat: add eval_config support to RL API client #271

Merged

chatgpt-codex-connector bot reviewed Jan 3, 2026

packages/prime/src/prime_cli/commands/rl.py (resolved)

cursor bot reviewed Jan 3, 2026

packages/prime/src/prime_cli/commands/rl.py (resolved)

kcoopermiller and others added 5 commits January 3, 2026 13:15

Chore/bump version 0.5.8 (#270)

27be637

* bump version to 0.5.8
* bump versions

Fix: Update eval sample field (#265)

894d04b

* Update eval sample field.
* Update docs.

Fix: Remove trailing comma from API token URL (#273)

cdefef5

Co-authored-by: Cursor Agent <cursoragent@cursor.com>
Co-authored-by: sami <sami@primeintellect.ai>

resolve conflicts

84abaf0

JannikSt approved these changes Jan 3, 2026

JannikSt merged commit 75ea30f into main Jan 3, 2026
16 of 18 checks passed

JannikSt deleted the feature/rft branch January 3, 2026 12:32

7 participants