Commit bbb9159

unamedkr and claude committed
feat: disable Qwen3 thinking mode by default (/no_think)
Qwen3-4B defaults to thinking mode ("Okay, the user asked..."), wasting tokens on reasoning chains. Adding /no_think to the system prompt produces direct answers.

Before: "Okay, the user asked... Let me recall... Gravity is a fu"
After: "Gravity is the force that attracts any object with mass..."
Speed: 4.3 tok/s (unchanged)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent e273f2b commit bbb9159

File tree

1 file changed: 1 addition, 1 deletion


bench/rlv/stages/_llm.py

Lines changed: 1 addition & 1 deletion

@@ -209,7 +209,7 @@ def stop_server():
 # reasoning chains in chat mode. Verified with the Acme test doc:
 # without this, the model picks the first entity (primacy bias);
 # with this, it correctly identifies the requested role.
-DEFAULT_SYSTEM_PROMPT = "Answer in one short sentence. No reasoning steps."
+DEFAULT_SYSTEM_PROMPT = "/no_think\nAnswer in one short sentence. No reasoning steps."


 MAX_LLM_RETRIES = 2  # retry once on transient server errors
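The change above only edits the system-prompt string; a minimal sketch of how such a prompt ends up in a chat request might look like the following. This assumes an OpenAI-style messages payload; `build_messages` is a hypothetical helper for illustration, not a function from `bench/rlv/stages/_llm.py`.

```python
# /no_think is Qwen3's soft switch that suppresses the model's reasoning
# block. The prompt text below mirrors the diff; everything else here is
# an illustrative sketch, not the repo's actual request-building code.
DEFAULT_SYSTEM_PROMPT = (
    "/no_think\nAnswer in one short sentence. No reasoning steps."
)

def build_messages(user_prompt: str) -> list[dict]:
    """Assemble an OpenAI-style chat payload with thinking disabled."""
    return [
        {"role": "system", "content": DEFAULT_SYSTEM_PROMPT},
        {"role": "user", "content": user_prompt},
    ]

messages = build_messages("What is gravity?")
```

Because the switch rides along in the system message, every request through this helper gets direct answers without touching per-request sampling or server flags.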
