What happened?
I was hitting WSL out-of-memory crashes during long Claude Code sessions with CCE active. Twice the whole VM went down and I had to pkill cce after restart to keep it from happening again immediately. After the last one I dug in with Claude Code as a debug pair to figure out what was going on, and it turned out to be reproducible with a pretty small setup.
The headline: every cce index invocation while cce serve is also running for the same project leaves behind ~5 GB of worker processes that never exit. They become orphans (reparented to init or whatever's left of my shell) and just sit there holding memory until I manually kill -9 them.
One clean repro I captured on my slop-clicker project (SvelteKit + Supabase, ~25k chunks indexed):
1. cce serve --project-dir /path/to/slop-clicker (PID 213019, ~313 MB idle)
2. another shell: cce index --path README.md
→ "Indexed 0 chunks from 0 files", exits in <2s
3. cce serve's process tree now has:
213019 cce serve
├── 214862 multiprocessing.resource_tracker (15 MB)
├── 214863 multiprocessing.forkserver (16 MB)
├── 214864 worker (1570 MB, state=R)
├── 214865 worker (843 MB, state=R)
├── 214866 worker (1570 MB, state=R)
└── 214867 worker (1570 MB, state=R)
4. kill -KILL 213019
5. 3 seconds later — workers still alive:
resource_tracker PPID=1556 (reparented) RSS=15 MB state=S
forkserver PPID=1556 (reparented) RSS=17 MB state=S
worker 214864 still ~1.6 GB RSS, transitioned R → S
worker 214866 still ~1.6 GB RSS, transitioned R → S
worker 214867 still ~1.6 GB RSS, transitioned R → S
(one worker exited normally, three didn't)
6. 10s later: still alive, still holding RSS, doing nothing
That's 5.2 GB leaked from one tiny cce index --path invocation. Across a long planning session with multiple commits / indexes / file edits, this compounds. I'm on a 12 GB WSL cap so it didn't take much before things got tight.
Important detail: cce index alone (no concurrent cce serve) doesn't leak. I ran cce index --path README.md three times in a row with no serve process running and ended with 0 leftover forkserver processes. The leak only happens when there's a sibling cce serve whose _reindex_worker got triggered by cce index's file-open events. The serve-spawned workers keep running through the queued reindex backlog, then idle, then never exit.
A few related things I noticed while digging:
Watcher reacts to read-only inotify events. During the cce index --path README.md run above, watchdog fires 618 opened events plus 618 closed_no_write events for files cce index opens to hash. Zero modified / created / deleted. CCE's watcher (indexer/watcher.py:44) uses on_any_event without filtering by type, so all of them flow into _reindex_pending and the reindex worker tries to process each one. For unchanged files run_indexing skips embedding via the hash check (I verified this — 100 plain cat ... > /dev/null reads only grew cce serve RSS by 3 MB and spawned zero workers). But when cce index is the source of the events, something in run_indexing's path still ends up calling embedder.embed even with 0 changed chunks, and that's what spawns the pool above. I didn't trace the exact line — just saw it happen empirically.
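If you want to reproduce the event-type breakdown, a throwaway watchdog observer along these lines works. This is a standalone script, not CCE code, and 'opened'/'closed_no_write' only show up with a reasonably recent watchdog on inotify:

# watch_events.py -- standalone observer, assumes `pip install watchdog`.
# Point it at the project dir, run `cce index --path README.md` in another
# shell, and watch the event types stream by; Ctrl-C prints the tally.
import sys
import time
from collections import Counter

from watchdog.events import FileSystemEventHandler
from watchdog.observers import Observer

counts = Counter()

class LogAll(FileSystemEventHandler):
    def on_any_event(self, event):
        # On inotify, event_type includes 'opened' and 'closed_no_write'
        # for pure reads, not just modified/created/deleted/moved.
        counts[event.event_type] += 1
        print(event.event_type, event.src_path)

if __name__ == "__main__":
    observer = Observer()
    observer.schedule(LogAll(), sys.argv[1], recursive=True)
    observer.start()
    try:
        while True:
            time.sleep(1)
    except KeyboardInterrupt:
        observer.stop()
    observer.join()
    print(dict(counts))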
No way to make embedding single-process on Linux. _resolve_parallel in indexer/embedder.py:33 returns min(cpu_count, 4) and CCE_EMBED_PARALLEL has a max(1, int(v)) floor:
CCE_EMBED_PARALLEL=unset → 4
CCE_EMBED_PARALLEL=0 → 1 (still multiprocess)
CCE_EMBED_PARALLEL=1 → 1
CCE_EMBED_PARALLEL=4 → 4
CCE_EMBED_PARALLEL=8 → 8 (no upper cap)
CCE_EMBED_PARALLEL=none → 4 (string silently ignored)
CCE_EMBED_PARALLEL=off → 4
CCE_EMBED_PARALLEL=false → 4
So on a 12-CPU host every cce serve that ends up embedding spawns 4 workers at ~1.6 GB each. On darwin/win32 the default is None (single-process, no fanout) — no equivalent path for Linux/WSL even when you'd want it.
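For clarity, the resolution behaves as if the logic were something close to this. This is reconstructed from the observed mappings above, not copied from indexer/embedder.py:

# Reconstructed from observed behavior -- not the actual CCE source.
import os
import sys

def _resolve_parallel() -> int | None:
    if sys.platform in ("darwin", "win32"):
        return None                        # single-process, no fanout
    raw = os.environ.get("CCE_EMBED_PARALLEL")
    if raw is not None:
        try:
            return max(1, int(raw))        # floor of 1: '0' still means multiprocess
        except ValueError:
            pass                           # 'none'/'off'/'false' silently ignored
    return min(os.cpu_count() or 1, 4)     # Linux default: up to 4 workers, no way off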
SIGINT and SIGQUIT are ignored by cce serve. Only SIGTERM, SIGHUP, SIGUSR1, SIGUSR2 and stdin EOF cause an exit. Probably the asyncio loop is swallowing SIGINT without re-raising it. Not a memory bug, but it bit me when I was trying to clean up orphans with kill -2.
What did you expect?
When cce serve or its forkserver pool shuts down, the multiprocessing children should clean up with it. I'd expect a try ... finally: pool.close(); pool.terminate(); pool.join() around the embed call, or a shutdown handler in _run_serve that propagates to the worker pool before exiting.
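Roughly this shape, where embed_with_pool and embed_chunk are stand-ins for whatever the real call site looks like:

# Sketch of the cleanup I'd expect -- names are illustrative, not CCE's.
import multiprocessing as mp

def embed_chunk(chunk):                      # stand-in for the real worker fn
    ...

def embed_with_pool(chunks, n_workers):
    ctx = mp.get_context("forkserver")       # matches the forkserver in the pstree above
    pool = ctx.Pool(processes=n_workers)
    try:
        result = pool.map(embed_chunk, chunks)
        pool.close()                         # normal path: workers finish and exit
        return result
    except BaseException:
        pool.terminate()                     # error path: don't strand the workers
        raise
    finally:
        pool.join()                          # either way, reap children before returning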
For the related items:
- Watcher should filter event types: handle on_modified/on_created/on_deleted/on_moved only, not on_any_event (sketch after this list).
- CCE_EMBED_PARALLEL=0 (or none/off) should map to the same single-process path darwin/win32 get for free. And probably cap the upper bound at cpu_count so users can't accidentally over-spawn.
- SIGINT could be wired up to the same shutdown handler SIGTERM uses (also in the sketch below).
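A sketch of the watcher and signal fixes together. ReindexHandler and the enqueue callback are illustrative; _run_serve's real signature is unknown, and shutdown stands in for whatever handler SIGTERM already triggers:

# Sketch only -- class/function names are not CCE's actual identifiers.
import asyncio
import signal

from watchdog.events import FileSystemEventHandler

class ReindexHandler(FileSystemEventHandler):
    """Only content-changing events; replaces the on_any_event catch-all."""

    def __init__(self, enqueue):
        self._enqueue = enqueue          # callback feeding _reindex_pending

    def on_modified(self, event): self._handle(event)
    def on_created(self, event):  self._handle(event)
    def on_deleted(self, event):  self._handle(event)
    def on_moved(self, event):    self._handle(event)
    # deliberately no on_any_event: 'opened' / 'closed_no_write' from
    # read-only access (like cce index hashing files) never enqueue anything

    def _handle(self, event):
        if not event.is_directory:
            self._enqueue(event.src_path)

async def _run_serve(shutdown):
    loop = asyncio.get_running_loop()
    for sig in (signal.SIGTERM, signal.SIGINT):
        loop.add_signal_handler(sig, shutdown)   # SIGINT joins SIGTERM's exit path
    ...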
Steps to reproduce
Pre-stage the fastembed model so the test isn't subject to download issues (separate problem, filing separately):
export FASTEMBED_CACHE_PATH=$HOME/.cache/fastembed
SNAP="$FASTEMBED_CACHE_PATH/models--qdrant--bge-small-en-v1.5-onnx-q/snapshots/52398278842ec682c6f32300af41344b1c0b0bb2"
mkdir -p "$SNAP" && cd "$SNAP"
for f in config.json tokenizer.json tokenizer_config.json special_tokens_map.json model_optimized.onnx; do
curl -sL -o "$f" "https://huggingface.co/qdrant/bge-small-en-v1.5-onnx-q/resolve/main/$f"
done
Then on any indexed project (I tested on slop-clicker, 25k chunks — anything non-trivial should repro):
# terminal 1
FASTEMBED_CACHE_PATH=$HOME/.cache/fastembed cce serve --project-dir /path/to/proj &
# wait for "CCE ready ..." in stderr
# terminal 2
FASTEMBED_CACHE_PATH=$HOME/.cache/fastembed cce index --path README.md
# completes in ~2s, "Indexed 0 chunks from 0 files"
# terminal 3 — inspect cce serve's process tree
SERVE_PID=$(pgrep -f 'python.*cce serve' | head -1)
pstree -p $SERVE_PID
# you'll see the forkserver supervisor + 4 workers, ~1.6 GB each
# kill cce serve and check
kill -KILL $SERVE_PID
sleep 3
ps aux | grep -E 'multiprocessing\.(forkserver|resource_tracker)' | grep -v grep
# the workers are still there, now orphans. SIGKILL to clean up.
Relevant logs or error output
Process snapshots and timelines from the investigation:
https://gist.github.com/AZagatti/7393f669a0fd785d7153e07a52a11127
Most relevant for this bug:
04-index-process-poll.txt — 60 ticks of process state during a full cce index, 12 distinct PIDs, ~8.7 GB combined RSS at peak
06-idle-vs-index-poll.txt — cce serve idle for 90s (stable, no children), then sibling cce index ran and serve spawned its own forkserver pool
07-watcher-event-types.log — the 1236 inotify events from cce index --path README.md, 99% read-only
Python version
3.13.5
OS
Ubuntu 24.04 LTS on WSL2 (kernel 6.6.87.2-microsoft-standard-WSL2)
CCE version
0.4.19