improvement: stream reasoning-content in <thinking> tags by JannikSt · Pull Request #593 · PrimeIntellect-ai/prime

JannikSt · 2026-05-03T00:29:57Z

Stream reasoning-content tokens (R1/o1-style models) inside <thinking> tags so the CLI doesn't appear to hang, and surface finish_reason at the end so length / content_filter truncations are visible.

Note

Low Risk
Low risk CLI-only output changes; main risk is minor formatting/streaming regressions when handling mixed reasoning_content and content chunks.

Overview
Improves prime inference chat output to surface model reasoning tokens by streaming reasoning_content inside <thinking> tags (both streaming and non-streaming responses) so the CLI shows progress even before final content.

Tracks finish_reason during generation and prints a warning when the response ends due to length without producing final content, prompting users to increase --max-tokens.

^{Reviewed by Cursor Bugbot for commit 1a5557d. Bugbot is set up for automated code reviews on this repo. Configure here.}

…h_reason The OpenAI-compat streaming response carries chain-of-thought tokens under delta.reasoning_content (R1/o1-style models). Without rendering, the user sees a long pause then the final answer. Wrap the reasoning chunks in <thinking>…</thinking> so the CLI mirrors the assistant's output shape. Also surface finish_reason at the end so 'length' or 'content_filter' truncations are visible.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 1a5557d39f

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-05-03T00:32:19Z

+        reasoning = msg.get("reasoning_content") or ""
+        finish_reason = choices[0].get("finish_reason")
+        if reasoning:
+            console.print(f"[dim]<thinking>\n{reasoning}\n</thinking>[/dim]")


Disable Rich markup when emitting reasoning_content

Printing reasoning with console.print(f"[dim]<thinking>\n{reasoning}\n</thinking>[/dim]") causes Rich to parse any [...] sequences inside model output as markup, which can drop or alter text (for example markdown links or bracketed code snippets). In non-streaming mode this corrupts the visible reasoning output and makes it unreliable for debugging; emit the text with markup=False (or write raw to stdout) so the response is preserved verbatim.

Useful? React with 👍 / 👎.

cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.

^{❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.}

^{Reviewed by Cursor Bugbot for commit 1a5557d. Configure here.}

cursor · 2026-05-03T00:35:25Z

+        reasoning = msg.get("reasoning_content") or ""
+        finish_reason = choices[0].get("finish_reason")
+        if reasoning:
+            console.print(f"[dim]<thinking>\n{reasoning}\n</thinking>[/dim]")


Rich markup injection corrupts model reasoning output

High Severity

The reasoning content from the model is interpolated directly into a Rich markup string via console.print(f"[dim]<thinking>\n{reasoning}\n</thinking>[/dim]"). Rich interprets square brackets as markup tags, so any brackets in the reasoning (array indices like [0], citations like [1], math like [a, b]) will be parsed as styling commands, causing text to silently disappear or produce rendering errors. The rest of the codebase uses markup=False when printing dynamic content (e.g., output_data_as_json).

^{Reviewed by Cursor Bugbot for commit 1a5557d. Configure here.}

chatgpt-codex-connector Bot reviewed May 3, 2026

View reviewed changes

cursor Bot reviewed May 3, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

improvement: stream reasoning-content in <thinking> tags#593

improvement: stream reasoning-content in <thinking> tags#593
JannikSt wants to merge 1 commit into
mainfrom
improvement/inference-reasoning-content-streaming

JannikSt commented May 3, 2026 •

edited by cursor Bot

Loading

Uh oh!

chatgpt-codex-connector Bot left a comment

Uh oh!

chatgpt-codex-connector Bot May 3, 2026

Uh oh!

cursor Bot left a comment

Uh oh!

cursor Bot May 3, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

JannikSt commented May 3, 2026 • edited by cursor Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot May 3, 2026

Choose a reason for hiding this comment

Uh oh!

cursor Bot left a comment

Choose a reason for hiding this comment

Uh oh!

cursor Bot May 3, 2026

Choose a reason for hiding this comment

Rich markup injection corrupts model reasoning output

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

JannikSt commented May 3, 2026 •

edited by cursor Bot

Loading