
Forkserver workers leak as orphan processes (~5 GB per cce index) when cce serve is also running #66

@AZagatti

Description

What happened?

I was hitting WSL out-of-memory crashes during long Claude Code sessions with CCE active. Twice the whole VM went down and I had to pkill cce after restart to keep it from happening again immediately. After the last one I dug in with Claude Code as a debug pair to figure out what was going on, and it turned out to be reproducible with a pretty small setup.

The headline: every cce index invocation while cce serve is also running for the same project leaves behind ~5 GB of worker processes that never exit. They become orphans (reparented to init or whatever's left of my shell) and just sit there holding memory until I manually kill -9 them.

One clean repro I captured on my slop-clicker project (SvelteKit + Supabase, ~25k chunks indexed):

1. cce serve --project-dir /path/to/slop-clicker   (PID 213019, ~313 MB idle)
2. another shell: cce index --path README.md
   → "Indexed 0 chunks from 0 files", exits in <2s
3. cce serve's process tree now has:
     213019 cce serve
       ├── 214862  multiprocessing.resource_tracker  (15 MB)
       └── 214863  multiprocessing.forkserver        (16 MB)
             ├── 214864  worker  (1570 MB, state=R)
             ├── 214865  worker  (843 MB,  state=R)
             ├── 214866  worker  (1570 MB, state=R)
             └── 214867  worker  (1570 MB, state=R)
4. kill -KILL 213019
5. 3 seconds later — workers still alive:
     resource_tracker  PPID=1556 (reparented)  RSS=15 MB   state=S
     forkserver        PPID=1556 (reparented)  RSS=17 MB   state=S
     worker  214864    still ~1.6 GB RSS, transitioned R → S
     worker  214866    still ~1.6 GB RSS, transitioned R → S
     worker  214867    still ~1.6 GB RSS, transitioned R → S
     (one worker exited normally, three didn't)
6. 10s later: still alive, still holding RSS, doing nothing

That's 5.2 GB leaked from one tiny cce index --path invocation. Across a long planning session with multiple commits / indexes / file edits, this compounds. I'm on a 12 GB WSL cap so it didn't take much before things got tight.

Important detail: cce index alone (no concurrent cce serve) doesn't leak. I ran cce index --path README.md three times in a row in a no-serve environment and ended with 0 leftover forkserver processes. The leak only happens when there's a sibling cce serve whose _reindex_worker got triggered by cce index's file-open events. The serve-spawned workers continue running for the queued reindex backlog, then idle, then never exit.

A few related things I noticed while digging:

Watcher reacts to read-only inotify events. During the cce index --path README.md above, watchdog fires 618 opened events plus 618 closed_no_write events for files cce index opens to hash. Zero modified / created / deleted. CCE's watcher (indexer/watcher.py:44) uses on_any_event without filtering by type, so all of them flow into _reindex_pending and the reindex worker tries to process each one.

For unchanged files run_indexing skips embedding via the hash check (I verified this — 100 plain cat ... > /dev/null reads only grew cce serve RSS by 3 MB and spawned zero workers). But when cce index is the source of the events, somewhere in run_indexing's path the embedder.embed call does fire even with 0 changed chunks, and that's what spawns the pool above. I didn't trace which exact line — just empirically saw it happen.
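For illustration, here's a dependency-free sketch of the kind of event-type filter I mean. The names FsEvent, queue_for_reindex, and MUTATING_EVENTS are mine, not CCE's; in the real code this would live in the watchdog handler in indexer/watcher.py, where event_type is the same plain string watchdog puts on its event objects:

```python
from dataclasses import dataclass

# Only these watchdog event types can change file content.
MUTATING_EVENTS = {"modified", "created", "deleted", "moved"}

@dataclass
class FsEvent:
    """Stand-in for watchdog.events.FileSystemEvent."""
    event_type: str
    src_path: str
    is_directory: bool = False

def queue_for_reindex(event, pending):
    """Queue a path for reindex only for mutating, non-directory events.

    opened / closed_no_write events (the ones cce index floods the watcher
    with) never reach the queue, so they can't wake the reindex worker.
    """
    if event.is_directory or event.event_type not in MUTATING_EVENTS:
        return False
    pending.add(event.src_path)
    return True
```

With this filter, the 1236 read-only events from a sibling cce index run would all be dropped before touching _reindex_pending.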

No way to make embedding single-process on Linux. _resolve_parallel in indexer/embedder.py:33 returns min(cpu_count, 4) and CCE_EMBED_PARALLEL has a max(1, int(v)) floor:

CCE_EMBED_PARALLEL=unset    → 4
CCE_EMBED_PARALLEL=0        → 1   (still multiprocess)
CCE_EMBED_PARALLEL=1        → 1
CCE_EMBED_PARALLEL=4        → 4
CCE_EMBED_PARALLEL=8        → 8   (no upper cap)
CCE_EMBED_PARALLEL=none     → 4   (string silently ignored)
CCE_EMBED_PARALLEL=off      → 4
CCE_EMBED_PARALLEL=false    → 4

So on a 12-CPU host every cce serve that ends up embedding spawns 4 workers at ~1.6 GB each. On darwin/win32 the default is None (single-process, no fanout) — no equivalent path for Linux/WSL even when you'd want it.
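A sketch of the resolution logic I'd expect instead. The function name and defaults are guesses from the observed behavior above, not the actual source of indexer/embedder.py:

```python
import os

def resolve_parallel(env=None):
    """Hypothetical _resolve_parallel replacement.

    0 / none / off / false  -> None (single-process, like darwin/win32)
    explicit integers       -> capped at cpu_count
    unset or unparseable    -> min(cpu_count, 4), the current default
    """
    env = os.environ if env is None else env
    cpus = os.cpu_count() or 1
    raw = env.get("CCE_EMBED_PARALLEL", "").strip().lower()
    if raw in {"0", "none", "off", "false"}:
        return None  # caller embeds in-process, no pool at all
    try:
        n = int(raw)
    except ValueError:
        return min(cpus, 4)  # unset/garbage falls back to today's default
    return max(1, min(n, cpus))  # explicit value, capped at cpu_count
```

Returning None (rather than 1) matters because even a 1-worker pool still pays the forkserver + resource_tracker overhead and still leaks.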

SIGINT and SIGQUIT are ignored by cce serve. Only SIGTERM, SIGHUP, SIGUSR1, SIGUSR2, and stdin EOF cause an exit. My guess is the asyncio loop is swallowing SIGINT without re-raising it. Not a memory bug, but it bit me when I was trying to clean up orphans with kill -2.
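A minimal sketch of what I mean, assuming _run_serve owns the event loop (the serve() stand-in below is mine, not CCE's code). asyncio's loop.add_signal_handler installs the handler on the running loop, so SIGINT can't be swallowed:

```python
import asyncio
import signal

async def serve():
    """Stand-in for _run_serve's loop: exits cleanly on any listed signal."""
    loop = asyncio.get_running_loop()
    stop = asyncio.Event()
    for sig in (signal.SIGTERM, signal.SIGINT, signal.SIGQUIT):
        loop.add_signal_handler(sig, stop.set)  # replaces the default handler
    await stop.wait()  # the real serve loop would run until a signal arrives
    return "clean shutdown"
```

add_signal_handler is Unix-only and must run in the main thread, which should match where cce serve's loop lives.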

What did you expect?

When cce serve or its forkserver pool shuts down, the multiprocessing children should be cleaned up with it. I'd expect a try ... finally around the embed call that calls pool.terminate() (or pool.close() for a graceful drain) followed by pool.join(), or a shutdown handler in _run_serve that propagates to the worker pool before exiting.
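Concretely, something shaped like this. _embed_one and embed_chunks are hypothetical stand-ins, and I'm using the fork start method so the snippet stays self-contained; the same with/join pattern applies to CCE's forkserver pool:

```python
import multiprocessing as mp

def _embed_one(chunk):
    # Stand-in for the real per-chunk embedding call.
    return len(chunk)

def embed_chunks(chunks, parallel=2):
    ctx = mp.get_context("fork")
    # Pool's context manager calls terminate() on exit, even if map() raises,
    # so no worker can outlive this call; join() then reaps them.
    with ctx.Pool(processes=parallel) as pool:
        results = pool.map(_embed_one, chunks)
    pool.join()
    return results
```

Either pattern would have prevented the orphans in the repro: once the embed call returns (or throws), the workers are terminated and reaped before cce serve goes back to idling.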

For the related items:

  • Watcher should filter event types — on_modified/on_created/on_deleted/on_moved only, not on_any_event.
  • CCE_EMBED_PARALLEL=0 (or none/off) should map to the same single-process path darwin/win32 get for free. And probably cap the upper bound at cpu_count so users can't accidentally over-spawn.
  • SIGINT could be wired up to the same shutdown handler SIGTERM uses.

Steps to reproduce

Pre-stage the fastembed model so the test isn't subject to download issues (separate problem, filing separately):

export FASTEMBED_CACHE_PATH=$HOME/.cache/fastembed
SNAP="$FASTEMBED_CACHE_PATH/models--qdrant--bge-small-en-v1.5-onnx-q/snapshots/52398278842ec682c6f32300af41344b1c0b0bb2"
mkdir -p "$SNAP" && cd "$SNAP"
for f in config.json tokenizer.json tokenizer_config.json special_tokens_map.json model_optimized.onnx; do
  curl -sL -o "$f" "https://huggingface.co/qdrant/bge-small-en-v1.5-onnx-q/resolve/main/$f"
done

Then on any indexed project (I tested on slop-clicker, 25k chunks — anything non-trivial should repro):

# terminal 1
FASTEMBED_CACHE_PATH=$HOME/.cache/fastembed cce serve --project-dir /path/to/proj &
# wait for "CCE ready ..." in stderr

# terminal 2
FASTEMBED_CACHE_PATH=$HOME/.cache/fastembed cce index --path README.md
# completes in ~2s, "Indexed 0 chunks from 0 files"

# terminal 3 — inspect cce serve's process tree
SERVE_PID=$(pgrep -f 'python.*cce serve' | head -1)
pstree -p $SERVE_PID
# you'll see the forkserver supervisor + 4 workers, ~1.6 GB each

# kill cce serve and check
kill -KILL $SERVE_PID
sleep 3
ps aux | grep -E 'multiprocessing\.(forkserver|resource_tracker)' | grep -v grep
# the workers are still there, now orphans. SIGKILL to clean up.

Relevant logs or error output

Process snapshots and timelines from the investigation:
https://gist.github.com/AZagatti/7393f669a0fd785d7153e07a52a11127

Most relevant for this bug:

  • 04-index-process-poll.txt — 60 ticks of process state during a full cce index, 12 distinct PIDs, ~8.7 GB combined RSS at peak
  • 06-idle-vs-index-poll.txt — cce serve idle for 90s (stable, no children), then sibling cce index ran and serve spawned its own forkserver pool
  • 07-watcher-event-types.log — the 1236 inotify events from cce index --path README.md, 99% read-only

Python version

3.13.5

OS

Ubuntu 24.04 LTS on WSL2 (kernel 6.6.87.2-microsoft-standard-WSL2)

CCE version

0.4.19
