Kokoro::stream hangs on non-EOS-terminated buffer; streamStop(false) never returns #1153

## Summary

`Kokoro::stream` hangs the streaming worker when the input buffer holds content that doesn't end in an end-of-sentence character. The buffer never drains, `streamStop(false)` waits forever, and only `streamStop(true)` recovers.

## Repro

Build the speech app from branch [`@ms/tts-stress-tests`](https://github.com/software-mansion/react-native-executorch/tree/@ms/tts-stress-tests) (preset chips reproduce this in one tap; see `no-term:a` / `no-term:long`). Or directly:

```ts
model.streamInsert('a');
const p = model.stream({ text: '' /* , ...other options */ });
await new Promise((r) => setTimeout(r, 6000));
model.streamStop(false);  // never returns
await p;
```

The same hang occurs with `'hello world'`, 2000 repetitions of U+200D (zero-width joiner), or any content that does not end in one of `.?!;`.

## Root cause

[`Kokoro.cpp:171-189`](https://github.com/software-mansion/react-native-executorch/blob/%40is/multilingual-tts/packages/react-native-executorch/common/rnexecutorch/models/text_to_speech/kokoro/Kokoro.cpp#L171-L189):

```cpp
size_t chunkSize = (eosIt != inputTextBuffer_.rend())
                       ? std::distance(eosIt, inputTextBuffer_.rend())
                       : 0;

if (chunkSize > 0 ||
    streamSkippedIterations >= params::kStreamMaxSkippedIterations) {
  input = inputTextBuffer_.substr(0, chunkSize);   // chunkSize still 0
  inputTextBuffer_.erase(0, chunkSize);            // erases nothing
  streamSkippedIterations = 0;                     // reset, loop forever
}
```

When no EOS exists in the buffer, `chunkSize = 0`. The force-flush threshold (`streamSkippedIterations >= kStreamMaxSkippedIterations`) fires correctly, but the extraction still uses `chunkSize` and pulls zero characters. Counter resets, loop continues, buffer never drains.
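
The failure mode can be reproduced outside the native code with a small TypeScript model of the partition step. This is a sketch, not the real implementation: `partitionStep`, `simulate`, and the threshold value are hypothetical, the search-window limit is omitted, and it assumes the skip counter is incremented on every iteration that extracts nothing.

```ts
// Hypothetical TypeScript model of the C++ partition step shown above.
const EOS = '.?!;';
const MAX_SKIPPED_ITERATIONS = 3; // placeholder, not the real constant value

interface StepResult {
  input: string;   // text handed to synthesis this iteration
  buffer: string;  // what remains in the input buffer
  skipped: number; // updated skip counter
}

function partitionStep(buffer: string, skipped: number): StepResult {
  // Index of the last EOS character, or -1 when there is none.
  let lastEos = -1;
  for (let i = buffer.length - 1; i >= 0; i--) {
    if (EOS.includes(buffer.charAt(i))) {
      lastEos = i;
      break;
    }
  }
  const chunkSize = lastEos + 1; // 0 when no EOS exists

  if (chunkSize > 0 || skipped >= MAX_SKIPPED_ITERATIONS) {
    // On a force flush chunkSize is still 0, so nothing is extracted
    // and the counter is reset anyway.
    return {
      input: buffer.slice(0, chunkSize),
      buffer: buffer.slice(chunkSize),
      skipped: 0,
    };
  }
  // Assumption: the worker counts iterations that extracted nothing.
  return { input: '', buffer, skipped: skipped + 1 };
}

function simulate(initial: string, iterations: number) {
  let buffer = initial;
  let skipped = 0;
  const emitted: string[] = [];
  for (let i = 0; i < iterations; i++) {
    const step = partitionStep(buffer, skipped);
    if (step.input.length > 0) emitted.push(step.input);
    buffer = step.buffer;
    skipped = step.skipped;
  }
  return { buffer, emitted };
}
```

Under these assumptions, `simulate('hello world', 100)` emits nothing and leaves the buffer unchanged no matter how many iterations run, while `simulate('Hi. there', 100)` emits `'Hi.'` and then stalls on the unterminated remainder.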

## Why the obvious fix is unsafe

Switching the force-flush branch to use the searchable window length:

```cpp
} else if (streamSkippedIterations >= params::kStreamMaxSkippedIterations) {
  input = inputTextBuffer_.substr(0, searchLimit);
  inputTextBuffer_.erase(0, searchLimit);
  streamSkippedIterations = 0;
}
```

…breaks the **LLM streaming** mode. `kStreamMaxSkippedIterations` is effectively a wall-clock timeout measured in units of `kStreamPause`, so at slow LLM token rates it trips before the LLM has produced a full sentence. That forces mid-word flushes and degrades speech quality. Tuning the threshold to satisfy both regimes is brittle; see the context in [#1134 comment](https://github.com/software-mansion/react-native-executorch/pull/1134#issuecomment-4491389865).

## Why this isn't just a theoretical concern

The current JS hook (`useTextToSpeech.ts:108-111`) hides this by auto-appending `.` when calling `stream({ text })`. Two places that bypass the rescue:

- Callers using `streamInsert` directly (incremental LLM tokens, dictation, partial captions).
- A streaming caller whose final chunk doesn't end with EOS (LLM truncated, network error, user pressed stop mid-sentence). The trailing un-terminated suffix sits in the buffer permanently.

In both cases `streamStop(false)` blocks forever with no diagnostic.

## Options that don't break LLM streaming

Each is a smaller change than re-tuning the skip counter:

1. **Caller-declared flush intent.** `streamInsert(text, { canFlush?: boolean })` — LLM mode passes `false`, normal apps pass `true`. Caller knows its own pacing.
2. **Explicit `streamFlush()` API.** Caller signals "I'm done feeding for now, partition what's left." LLM mode never calls it; normal apps call it before `streamStop(false)`. No threshold tuning.
3. **Wall-clock idle timeout.** Track time since last `streamInsert`. If the buffer has content and N seconds passed without new inserts, flush. LLM streams keep inserting fast enough to never trip it.
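
To make options 2 and 3 concrete, here is a rough TypeScript model of a buffer that drains an unterminated tail only on an explicit flush or after an idle period. Everything in it is hypothetical (`SketchStreamBuffer`, `nextChunk`, the injected clock values); the real change would live in the native worker, and this sketch only illustrates the intended behavior.

```ts
// Hypothetical model of options 2 and 3; none of these names exist in the library.
const EOS_CHARS = '.?!;';

function lastEosIndex(text: string): number {
  for (let i = text.length - 1; i >= 0; i--) {
    if (EOS_CHARS.includes(text.charAt(i))) return i;
  }
  return -1;
}

class SketchStreamBuffer {
  private buffer = '';
  private lastInsertMs = 0;
  private flushRequested = false;
  private idleTimeoutMs: number;

  constructor(idleTimeoutMs: number) {
    this.idleTimeoutMs = idleTimeoutMs;
  }

  // The caller passes the current time so the sketch stays deterministic.
  insert(text: string, nowMs: number): void {
    this.buffer += text;
    this.lastInsertMs = nowMs;
  }

  // Option 2: the caller signals that whatever is left may be partitioned.
  flush(): void {
    this.flushRequested = this.buffer.length > 0;
  }

  // Called once per worker iteration; returns the text to synthesize, or ''.
  nextChunk(nowMs: number): string {
    let chunkSize = lastEosIndex(this.buffer) + 1;
    // Option 3: no insert for idleTimeoutMs while content is pending.
    const idle = nowMs - this.lastInsertMs >= this.idleTimeoutMs;
    if (chunkSize === 0 && this.buffer.length > 0 && (this.flushRequested || idle)) {
      chunkSize = this.buffer.length; // drain the unterminated tail
    }
    const chunk = this.buffer.slice(0, chunkSize);
    this.buffer = this.buffer.slice(chunkSize);
    if (this.buffer.length === 0) this.flushRequested = false;
    return chunk;
  }

  get pending(): string {
    return this.buffer;
  }
}
```

In this model an LLM-style caller that keeps inserting and never calls `flush()` only ever gets sentence-terminated chunks, while a caller that stops feeding gets the tail either immediately after `flush()` or once the idle timeout elapses.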

## Mitigations if a proper fix is out of scope

- **Document the hazard** in `streamInsert` / `streamStop` JSDoc — currently nothing in the API surface hints that `streamStop(false)` can hang indefinitely.
- **JS-side safety net** in the hook's `stream()` wrapper: time `streamStop(false)`, fall back to `streamStop(true)` with `console.warn` after some threshold. Zero cost to LLM streaming; prevents downstream apps from soft-locking.
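
A minimal sketch of that safety net follows. It assumes the hang shows up on the JS side as a promise that never settles (either the one returned by `stream()` or one returned by `streamStop`), not as a blocked JS thread. `StoppableStream`, `stopWithFallback`, and the `instant` parameter name are hypothetical and not part of the current API.

```ts
// Hypothetical shape: only the call the wrapper needs.
interface StoppableStream {
  streamStop(instant: boolean): void | Promise<void>;
}

async function stopWithFallback(
  model: StoppableStream,
  streamDone: Promise<unknown>, // the promise returned by stream()
  timeoutMs: number,
): Promise<'graceful' | 'forced'> {
  // Ask for a graceful stop and wait for the stream to drain.
  const drained = Promise.all([model.streamStop(false), streamDone]).then(
    () => 'done' as const,
  );

  let timer: ReturnType<typeof setTimeout> | undefined;
  const timedOut = new Promise<'timeout'>((resolve) => {
    timer = setTimeout(() => resolve('timeout'), timeoutMs);
  });

  const outcome = await Promise.race([drained, timedOut]);
  clearTimeout(timer);
  if (outcome === 'done') return 'graceful';

  // The buffer did not drain in time: warn and force the stop.
  console.warn(
    `streamStop(false) did not complete within ${timeoutMs} ms; forcing stop`,
  );
  await model.streamStop(true);
  await streamDone;
  return 'forced';
}
```

The timeout value is left to the caller, since an appropriate threshold depends on how long legitimately pending speech may take to synthesize.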

## References

- PR #1134 review comment with full stress-test findings: https://github.com/software-mansion/react-native-executorch/pull/1134#issuecomment-4491389865
- Stress-test app branch: https://github.com/software-mansion/react-native-executorch/tree/@ms/tts-stress-tests

