Commit 01e0a2d

unamedkr and claude authored
fix(wasm): remove prefill sleep — restores ASYNCIFY token streaming (#33)
The emscripten_sleep(0) added to quant.h's prefill loop (PR #30) broke ASYNCIFY for the entire quant_generate call. The call stack during tq_forward() is too deep (matmul → SIMD kernels) for ASYNCIFY to unwind/rewind; it silently fails and the generation callback's sleep stops working too.

Fix: remove prefill sleep entirely. The prefill blocks the browser for a few seconds (unavoidable without a step-by-step API), but "Thinking..." is shown before via requestAnimationFrame. Token streaming during generation works again.

Also: pthreads removed (PR #32) to avoid pthreads+ASYNCIFY conflict, build.sh now uses single-thread SIMD + ASYNCIFY only.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
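The message notes that the blocking prefill is unavoidable without a step-by-step API. As a rough illustration only, such an API might look like the sketch below. Every name here (prefill_job_t, prefill_begin, prefill_step, stub_forward) is hypothetical and does not exist in quant.h; the stub stands in for tq_forward(). The idea is that each call processes a bounded number of prompt tokens and then returns, so the caller can give control back to the browser between calls without relying on ASYNCIFY.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for prefill progress; not a quant.h type. */
typedef struct {
    const int *prompt_tokens; /* caller-owned prompt token ids */
    int n_prompt;             /* number of prompt tokens */
    int next;                 /* index of the next token to process */
    long forward_calls;       /* counts stub forward passes (illustration) */
} prefill_job_t;

/* Stub standing in for tq_forward(); a real forward pass would run here. */
static void stub_forward(prefill_job_t *job, int token, int pos) {
    (void)token;
    (void)pos;
    job->forward_calls++;
}

/* Initialise a job over a caller-owned token array. */
static void prefill_begin(prefill_job_t *job, const int *tokens, int n) {
    assert(job != NULL && n >= 0);
    job->prompt_tokens = tokens;
    job->n_prompt = n;
    job->next = 0;
    job->forward_calls = 0;
}

/* Process at most max_tokens prompt tokens, then return to the caller.
 * Returns the number of prompt tokens still pending (0 when done). */
static int prefill_step(prefill_job_t *job, int max_tokens) {
    assert(job != NULL && max_tokens > 0);
    int end = job->next + max_tokens;
    if (end > job->n_prompt) end = job->n_prompt;
    for (int i = job->next; i < end; i++) {
        stub_forward(job, job->prompt_tokens[i], i);
    }
    job->next = end;
    return job->n_prompt - job->next;
}
```

A caller would invoke prefill_step() repeatedly until it returns 0, yielding to the event loop between calls at whatever granularity keeps the page responsive.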
1 parent 3df9a49 commit 01e0a2d

3 files changed: +6 -7 lines changed

quant.h

Lines changed: 5 additions & 6 deletions
@@ -15462,14 +15462,13 @@ int tq_generate(tq_model_t* model, tq_tokenizer_t* tokenizer,
     }
 
     /* Prefill: process all prompt tokens.
-     * On Emscripten with ASYNCIFY, yield every 2 tokens so the browser
-     * can repaint (shows "Thinking..." and avoids "page unresponsive"). */
+     * NOTE: No emscripten_sleep() here — the call stack during tq_forward()
+     * is too deep for ASYNCIFY to unwind (matmul → SIMD kernels). Adding
+     * sleep here breaks ASYNCIFY for the entire generate call, including
+     * the token streaming callback. The browser shows "Thinking..." via
+     * requestAnimationFrame before entering this blocking prefill. */
     for (int i = 0; i < n_prompt; i++) {
         tq_forward(model, state, prompt_tokens[i], i);
-#ifdef __EMSCRIPTEN__
-        extern void emscripten_sleep(unsigned int ms);
-        if (i % 2 == 1) emscripten_sleep(0);
-#endif
     }
 
     /* Repetition penalty setup */
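After this change the only remaining yield point is the one in the token streaming path. A minimal sketch of that pattern follows, assuming the yield happens once per generated token in the generation loop. The names token_cb_t, token_log_t, log_token, yield_to_browser, stub_next_token, and generate_tokens are hypothetical and are not taken from quant.h; only emscripten_sleep() is a real Emscripten call, and it needs ASYNCIFY enabled at link time. In a native build the yield compiles to a no-op.

```c
#include <assert.h>
#include <stddef.h>

#ifdef __EMSCRIPTEN__
#include <emscripten.h> /* declares emscripten_sleep() */
#endif

/* Hypothetical callback type: receives each generated token id. */
typedef void (*token_cb_t)(int token, void *user_data);

/* Hypothetical sink that records streamed tokens for inspection. */
typedef struct {
    int tokens[16];
    int count;
} token_log_t;

static void log_token(int token, void *user_data) {
    token_log_t *log = user_data;
    if (log->count < 16) log->tokens[log->count++] = token;
}

/* Yield to the browser event loop; a no-op in native builds. */
static void yield_to_browser(void) {
#ifdef __EMSCRIPTEN__
    emscripten_sleep(0);
#endif
}

/* Stub: stands in for a forward pass plus sampling of the next token. */
static int stub_next_token(int prev) { return prev + 1; }

/* Generation loop: stream one token, then yield from the loop body only.
 * Returns the number of tokens produced. */
static int generate_tokens(int first, int max_tokens, token_cb_t cb, void *ud) {
    assert(max_tokens >= 0);
    int tok = first;
    int n = 0;
    for (; n < max_tokens; n++) {
        tok = stub_next_token(tok);
        if (cb) cb(tok, ud);
        yield_to_browser();
    }
    return n;
}
```

Calling generate_tokens(10, 3, log_token, &log) with a zero-initialised log records three tokens, with one yield opportunity after each.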

wasm/quant.js

Lines changed: 1 addition & 1 deletion
Generated file; diff not rendered by default.

wasm/quant.wasm

-48 Bytes
Binary file not shown.
