- Replace greedy decoding with Qwen3 model card sampling params (temp=0.6/top_p=0.95 for thinking)
- Filter thinking content in ai-worker via skip_special_tokens:false + state machine
- Increase thinking model token limit from 1024 to 4096
- Add I'll/I'm contraction patterns + trailing cleanup to cleanThinkingArtifacts
- Search results shown in collapsible thinking block before AI response
- Move changelog to changelogs/ directory
ai-worker-common.js: 3 additions & 3 deletions
@@ -37,7 +37,7 @@ const SYSTEM_PROMPTS = {
  autocomplete:
  'You are a helpful writing assistant. Continue writing the text naturally. Only output the continuation, do not repeat the existing text. Write 1-2 sentences.',
  generate:
- 'You are a helpful content generation assistant. Generate content based on the user\'s request. Output in well-formatted markdown.',
+ 'You are a helpful content generation assistant. Generate content based on the user\'s request. Output in well-formatted markdown. Do NOT use LaTeX $...$ or $$...$$ notation for math — use plain text or Unicode instead (e.g. write "x²" not "$x^2$"). Do NOT include any internal thinking, reasoning process, mental notes, or meta-commentary. Output ONLY the final answer.',
  markdown:
  'You are a markdown expert. Generate well-formatted markdown content based on the user\'s request. Use headings, lists, tables, code blocks, and other markdown features as appropriate.',
  explain:
@@ -52,8 +52,8 @@ const SYSTEM_PROMPTS = {
  'You are a helpful writing assistant. Elaborate on the following text by adding more details, examples, and explanations to make it more comprehensive. Output in markdown format.',
  shorten:
  'You are a concise writing editor. Shorten the following text while preserving all key information. Remove redundancy and use fewer words. Only output the shortened text.',
- qa: 'You are a helpful assistant. Answer the user\'s question based on the provided document context. Be concise. If the answer cannot be found in the context, say so.',
- chat: 'You are a helpful AI assistant integrated into a Markdown editor. Help the user with writing, editing, and formatting tasks. Be concise. Output in markdown format.',
+ qa: 'You are a helpful assistant. The user may have document context open in their editor. If the question relates to the provided context, use it to answer. If the question is unrelated to the context, answer directly from your knowledge. Be concise. Do NOT use LaTeX $...$ or $$...$$ notation — use plain text or Unicode for math. Do NOT include any internal reasoning, thinking process, or meta-commentary. Output in markdown format.',
+ chat: 'You are a helpful AI assistant integrated into a Markdown editor. Help the user with writing, editing, and formatting tasks. Be concise. Output in markdown format. Do NOT use LaTeX $...$ or $$...$$ notation for math — use plain text or Unicode instead. Do NOT include any internal thinking, reasoning steps, drafting notes, or meta-commentary. Output ONLY the final polished answer.',
ai-worker-gemini.js: 3 additions & 3 deletions
@@ -146,16 +146,16 @@ function buildMessages(taskType, context, userPrompt) {
  rephrase: 'You are a helpful writing assistant. Rephrase the following text to improve clarity and readability while preserving the meaning. Output in markdown format.',
  grammar: 'You are a helpful writing assistant. Fix any grammar, spelling, and punctuation errors in the following text. Only output the corrected text, nothing else.',
  autocomplete: 'You are a helpful writing assistant. Continue writing the text naturally. Only output the continuation, do not repeat the existing text. Write 1-2 sentences.',
- generate: 'You are a helpful content generation assistant. Generate content based on the user\'s request. Output in well-formatted markdown.',
+ generate: 'You are a helpful content generation assistant. Generate content based on the user\'s request. Output in well-formatted markdown. Do NOT use LaTeX $...$ or $$...$$ notation for math — use plain text or Unicode instead (e.g. write "x²" not "$x^2$"). Do NOT include any internal thinking, reasoning process, mental notes, or meta-commentary. Output ONLY the final answer.',
  markdown: 'You are a markdown expert. Generate well-formatted markdown content based on the user\'s request. Use headings, lists, tables, code blocks, and other markdown features as appropriate.',
  explain: 'You are a helpful assistant. Explain the following text in simple, easy-to-understand terms. Be concise. Output in markdown format.',
  simplify: 'You are a helpful writing assistant. Simplify the following text to make it easier to understand. Use shorter sentences and simpler words. Output in markdown format.',
  polish: 'You are a skilled writing editor. Polish the following text to improve flow, word choice, and overall quality while preserving the meaning and tone. Only output the polished text.',
  formalize: 'You are a professional writing assistant. Rewrite the following text in a more formal, professional tone suitable for business or academic contexts. Only output the formalized text.',
  elaborate: 'You are a helpful writing assistant. Elaborate on the following text by adding more details, examples, and explanations to make it more comprehensive. Output in markdown format.',
  shorten: 'You are a concise writing editor. Shorten the following text while preserving all key information. Remove redundancy and use fewer words. Only output the shortened text.',
- qa: 'You are a helpful assistant. Answer the user\'s question based on the provided document context. Be concise. If the answer cannot be found in the context, say so.',
- chat: 'You are a helpful AI assistant integrated into a Markdown editor. Help the user with writing, editing, and formatting tasks. Be concise. Output in markdown format.',
+ qa: 'You are a helpful assistant. The user may have document context open in their editor. If the question relates to the provided context, use it to answer. If the question is unrelated to the context, answer directly from your knowledge. Be concise. Do NOT use LaTeX $...$ or $$...$$ notation — use plain text or Unicode for math. Do NOT include any internal reasoning, thinking process, or meta-commentary. Output in markdown format.',
+ chat: 'You are a helpful AI assistant integrated into a Markdown editor. Help the user with writing, editing, and formatting tasks. Be concise. Output in markdown format. Do NOT use LaTeX $...$ or $$...$$ notation for math — use plain text or Unicode instead. Do NOT include any internal thinking, reasoning steps, drafting notes, or meta-commentary. Output ONLY the final polished answer.',
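The body of `buildMessages(taskType, context, userPrompt)` is not visible in this diff, so the following is only a rough sketch of how a per-task system prompt and optional document context might be combined into a chat message array. The function name `buildMessagesSketch`, the `prompts` parameter, the fallback to the `chat` prompt, and the "Document context:" label are all assumptions made for illustration.

```javascript
// Hypothetical sketch; the real buildMessages in ai-worker-gemini.js may differ.
function buildMessagesSketch(prompts, taskType, context, userPrompt) {
  // Assumption: unknown task types fall back to the general chat prompt.
  const system = prompts[taskType] ?? prompts.chat;

  // Assumption: document context, when present, is prepended to the user turn
  // so the model can decide whether the question relates to it.
  const userParts = [];
  if (context) {
    userParts.push(`Document context:\n${context}`);
  }
  userParts.push(userPrompt);

  return [
    { role: 'system', content: system },
    { role: 'user', content: userParts.join('\n\n') },
  ];
}
```

With the revised `qa` prompt, a call with an empty `context` would still produce a valid two-message array, which matches the prompt's instruction to answer unrelated questions from general knowledge.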
changelogs/CHANGELOG-search-thinking-block.md: 19 additions & 2 deletions
@@ -38,12 +38,29 @@ Refactors the AI chat search flow to show web search results in a collapsible "t
  **What:** Added `.ai-thinking-block` container with green-accented border and fade-in animation, `.ai-thinking-spin` rotation keyframe for the search spinner, `.ai-thinking-searching` for the loading state, and `.ai-thinking-no-results` for the empty state with amber info icon. Dark mode variants included.
  **Impact:** Consistent, polished visual treatment matching the existing AI panel design.
+ ## 5. Qwen3 Thinking Model — Correct Sampling Parameters
+ **Files:** `ai-worker.js`
+ **What:** Replaced greedy decoding (`do_sample: false`) with sampling that uses the parameters recommended by the Qwen3 model card: `temperature=0.6, top_p=0.95, top_k=20` for thinking mode and `temperature=0.7, top_p=0.8, top_k=20` for non-thinking mode. Greedy decoding causes "performance degradation and endless repetitions" per the Qwen3 docs. Increased max tokens from 1024 to 4096 for thinking mode.
+ **Impact:** The thinking model no longer gets stuck in an infinite thinking loop and actually produces an answer.
+
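As a rough illustration of the sampling change described in section 5, the sketch below picks generation options by mode. The numeric values come from the changelog entry; the names `QWEN3_SAMPLING` and `buildGenerationOptions`, the `max_new_tokens` key, and the assumption that non-thinking mode keeps the 1024-token limit are hypothetical and not taken from `ai-worker.js`.

```javascript
// Hypothetical helper; names and option keys are illustrative assumptions.
const QWEN3_SAMPLING = {
  // Values quoted in the changelog as the Qwen3 model card recommendations.
  thinking: { temperature: 0.6, top_p: 0.95, top_k: 20 },
  nonThinking: { temperature: 0.7, top_p: 0.8, top_k: 20 },
};

function buildGenerationOptions(enableThinking) {
  const sampling = enableThinking
    ? QWEN3_SAMPLING.thinking
    : QWEN3_SAMPLING.nonThinking;
  return {
    do_sample: true, // sampling instead of greedy decoding
    ...sampling,
    // Assumption: only thinking mode gets the larger 4096-token budget.
    max_new_tokens: enableThinking ? 4096 : 1024,
  };
}
```

Keeping both presets in one table makes it harder for the two generation paths to drift apart when the recommended values change.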
+ ## 6. Thinking Content Filter — Worker-level `<think>` Tag Stripping
+ **Files:** `ai-worker.js`
+ **What:** When `enableThinking` is true, set `skip_special_tokens: false` so `<think>`/`</think>` markers remain visible in the TextStreamer callback. Added a state machine that buffers thinking tokens and only forwards content after `</think>`. Strips leftover special tokens (`<|im_start|>`, etc.) from forwarded content. Applied to both text-only and vision generation paths.
+ **Impact:** Raw thinking content (planning bullets, reasoning monologue) no longer leaks into the chat response.
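The filter in section 6 can be pictured as a two-state machine over streamed text chunks. The sketch below is a minimal version under stated assumptions: generation starts in the thinking state, special tokens of the form `<|...|>` arrive whole within a single chunk, and `createThinkingFilter` and `forward` are hypothetical names rather than identifiers from `ai-worker.js`.

```javascript
// Hypothetical sketch of a streaming <think> filter; not the code in ai-worker.js.
const CLOSE_TAG = '</think>';
const SPECIAL_TOKEN = /<\|[^|]*\|>/g; // e.g. <|im_start|>, <|im_end|>

function createThinkingFilter(forward) {
  let state = 'thinking'; // assumption: output begins inside the thinking block
  let buffer = '';

  return function onChunk(text) {
    if (state === 'thinking') {
      // Buffer everything until the closing tag shows up, even if the tag
      // is split across two chunks.
      buffer += text;
      const end = buffer.indexOf(CLOSE_TAG);
      if (end === -1) return;
      state = 'answer';
      text = buffer.slice(end + CLOSE_TAG.length);
      buffer = '';
    }
    // In the answer state, drop leftover special tokens and forward the rest.
    const cleaned = text.replace(SPECIAL_TOKEN, '');
    if (cleaned) forward(cleaned);
  };
}
```

In this sketch the returned function would be passed as the streamer's text callback. One consequence of the design is that nothing is forwarded if the model never emits `</think>`, so a caller would likely want a fallback for that case.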
+ **What:** Added `I'll/I'm/I've/I'd` contraction patterns to the reasoning detector (previously only matched `I 'll` with a space). Added trailing cleanup that strips planning outlines (`1. What the Black-Scholes equation is...`) and bare numbered items (`4.`) from the end of responses.
+ **Impact:** Catches residual reasoning that appears after `</think>` in the model's actual response content.
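To make the two cleanup rules above concrete, here is a small sketch of a contraction matcher and a trailing-line stripper. It is not the `cleanThinkingArtifacts` implementation, which is not shown in this diff; the names `FIRST_PERSON_CONTRACTION`, `looksLikeReasoning`, and `stripTrailingPlanning`, and the exact regular expressions, are assumptions chosen to match the examples quoted in the changelog.

```javascript
// Hypothetical sketch; the real patterns in cleanThinkingArtifacts may differ.
// Matches I'll / I'm / I've / I'd, with a straight or curly apostrophe,
// and also the spaced form "I 'll" that was matched previously.
const FIRST_PERSON_CONTRACTION = /\bI ?['’](?:ll|m|ve|d)\b/;

function looksLikeReasoning(line) {
  return FIRST_PERSON_CONTRACTION.test(line);
}

// Removes trailing blank lines, bare numbered items ("4."), and numbered
// outline items that trail off with an ellipsis ("1. What ... is...").
function stripTrailingPlanning(text) {
  const lines = text.split('\n');
  while (lines.length > 0) {
    const last = lines[lines.length - 1].trim();
    const isBlank = last === '';
    const isBareNumber = /^\d+\.$/.test(last);
    const isOutlineItem = /^\d+\.\s+.*(?:\.\.\.|…)$/.test(last);
    if (!(isBlank || isBareNumber || isOutlineItem)) break;
    lines.pop();
  }
  return lines.join('\n');
}
```

A matcher this broad would also flag legitimate first-person answers such as "I'm happy to help", so in practice it would presumably be combined with other signals before any text is removed.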