
Setup / first-run issues: broken post-commit hook, model download stalls, /tmp cache wipe, silent lazy index #67

@AZagatti

Description


What happened?

Filing as a group because they're all in the install / first-run / setup path and I hit them in sequence while setting CCE up. Used Claude Code as a debug pair to dig into each one. Will split if you'd prefer, just felt spammy to file as five separate issues.

1. cce init installs a post-commit hook that errors silently on every commit.

The hook script written to .git/hooks/post-commit calls:

cce index --changed-only >/dev/null 2>&1 &

But in v0.4.19, cce index only accepts --full and --path. The --changed-only flag doesn't exist:

$ cce index --changed-only
Usage: cce index [OPTIONS]
Try 'cce index --help' for help.

Error: No such option: --changed-only

Because output is redirected with >/dev/null 2>&1, the error is invisible. My commits looked like they were keeping the index up to date, but nothing was happening. The flag is hardcoded in src/context_engine/indexer/git_hooks.py:36 — confirmed against the v0.4.19 source on main.

I think cce index (no flag) already does incremental indexing of changed files, so just dropping --changed-only from the hook template should fix it.
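A minimal sketch of what the corrected template could look like (the template string and function names below are my assumptions, not the actual contents of git_hooks.py):

```python
# Hypothetical corrected hook template: since plain `cce index` already
# indexes changed files incrementally, the nonexistent --changed-only
# flag is simply dropped from the template.
POST_COMMIT_TEMPLATE = """#!/bin/sh
# cce hook
{cce_bin} index >/dev/null 2>&1 &
"""


def render_post_commit(cce_bin: str) -> str:
    """Render the post-commit hook body for the given cce executable path."""
    return POST_COMMIT_TEMPLATE.format(cce_bin=cce_bin)
```

Existing installs would still need `cce init` re-run (or a manual edit of .git/hooks/post-commit) to pick up the fixed template.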

2. The model download via huggingface_hub has no timeout and stalls indefinitely.

On a fresh WSL my first cce serve (which preloads the embedding model) hung for several minutes with 5 ESTABLISHED IPv6 connections to the HF CDN but zero bytes downloaded into the ONNX blob file. Same machine, same minute, curl of the same URL pulled the 66 MB ONNX in under 5 seconds:

$ curl -sL -o /tmp/m.onnx -w "size=%{size_download} time=%{time_total}\n" \
    https://huggingface.co/qdrant/bge-small-en-v1.5-onnx-q/resolve/main/model_optimized.onnx
size=66465124 time=4.736027

I had this happen on two different WSL boots under different network conditions: once over IPv4 (stuck in TCP SYN-SENT) and once over IPv6 (5 sockets ESTABLISHED but no bytes transferring). I think the underlying issue is simply the lack of a timeout / retry budget around the TextEmbedding(...) call at indexer/embedder.py:64; huggingface_hub will happily wait forever.
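For what it's worth, huggingface_hub reads an HF_HUB_DOWNLOAD_TIMEOUT env var for its per-request timeout, which might bound the wait without code changes. On the code side, the kind of retry budget I mean is sketched below, assuming the model load is wrapped in a callable (the wrapper name is mine):

```python
import time


def with_retry_budget(fn, attempts=3, base_delay=2.0):
    """Run fn(); on failure, retry with exponential backoff and re-raise
    once the budget is exhausted instead of waiting forever."""
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise
            time.sleep(base_delay * 2 ** attempt)  # 2 s, 4 s, ...
```

The point is just that the embedder load fails loudly after a bounded wait instead of hanging indefinitely.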

Worse: when the download stalls partway, fastembed has already created the snapshot directory with the small config blobs, but model_optimized.onnx exists as a 0-byte .incomplete file. Every subsequent cce serve / cce search / cce index then crashes immediately on:

RuntimeError: Failed to load embedding model 'BAAI/bge-small-en-v1.5'. ...
Original error: [ONNXRuntimeError] : 3 : NO_SUCHFILE : Load model from
/tmp/fastembed_cache/.../snapshots/.../model_optimized.onnx failed:
Load model ... failed. File doesn't exist

The broken state is sticky until you manually rm -rf the cache.
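A self-healing check CCE could run before loading the model, sketched under the assumption that the cache layout is as described above (the function name is mine):

```python
import shutil
from pathlib import Path


def clear_stale_fastembed_cache(cache_dir: Path) -> bool:
    """If the cache contains a 0-byte *.incomplete blob, the snapshot can
    never load; wipe it so the next start re-downloads instead of crashing
    on NO_SUCHFILE. Returns True if a stale cache was removed."""
    if not cache_dir.is_dir():
        return False
    stale = any(p.stat().st_size == 0 for p in cache_dir.rglob("*.incomplete"))
    if stale:
        shutil.rmtree(cache_dir)
    return stale
```

That would at least turn the sticky broken state into a one-time retry rather than a manual rm -rf.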

3. Default cache lives in /tmp/fastembed_cache, which WSL wipes on reboot.

fastembed's default cache_dir is Path(tempfile.gettempdir()) / "fastembed_cache", which resolves to /tmp/fastembed_cache on Linux. CCE never overrides this — the TextEmbedding(...) call passes no cache_dir arg.

On Ubuntu under WSL with systemd=true (the default), /usr/lib/tmpfiles.d/tmp.conf contains:

D /tmp 1777 root root 30d

The D directive empties /tmp recursively whenever systemd-tmpfiles-setup.service runs, which is on every boot. So even when the model download succeeds, the next WSL restart wipes it and the next CCE start has to redownload — which can hit issue #2 again.

Configuration.md doesn't mention FASTEMBED_CACHE_PATH. The fix would be passing cache_dir=Path.home()/".cache"/"fastembed" (or anywhere persistent) to TextEmbedding, plus documenting it.
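The resolution order I'd expect, sketched below with FASTEMBED_CACHE_PATH taking priority over a persistent per-user default (the helper name is mine):

```python
import os
from pathlib import Path


def resolve_embed_cache_dir() -> Path:
    """Honor FASTEMBED_CACHE_PATH if set; otherwise fall back to a
    reboot-safe per-user cache instead of fastembed's /tmp default."""
    env = os.environ.get("FASTEMBED_CACHE_PATH")
    cache_dir = Path(env) if env else Path.home() / ".cache" / "fastembed"
    cache_dir.mkdir(parents=True, exist_ok=True)
    return cache_dir

# CCE would then pass this explicitly, e.g.:
# TextEmbedding(model_name=..., cache_dir=str(resolve_embed_cache_dir()))
```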

4. Hook script blocks 1-2 seconds per call when serve.port is stale.

If cce serve ever dies but leaves its serve.port file behind in ~/.cce/projects/<basename>/, every hook fires curl -m 1 (or -m 2 for SessionStart) to a port nothing is listening on, waits for the full timeout, and exits silently. I timed it with a real stale port file:

$ time ~/.cce/hooks/cce_hook.sh PostToolUse < /dev/null
real    1.020s
$ time ~/.cce/hooks/cce_hook.sh SessionStart < /dev/null
real    2.012s

That's 1-2s of dead wait per Claude Code hook event. Long sessions fire hundreds of PostToolUse and UserPromptSubmit hooks, so this accumulates pretty fast. A quick bash -c "exec 3<>/dev/tcp/127.0.0.1/$PORT" 2>/dev/null liveness probe before the curl would skip the wait. Or cce serve should remove its port file on shutdown.
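An equivalent probe sketched in Python, in case the hook logic ever moves out of shell (the bash /dev/tcp trick above does the same thing; the function name is mine):

```python
import socket


def port_alive(port: int, host: str = "127.0.0.1", timeout: float = 0.1) -> bool:
    """TCP-connect liveness probe: a dead local port fails in milliseconds
    (connection refused) instead of curl waiting out its full -m timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```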

5. First context_search MCP call silently triggers a full project re-index if the index is empty.

I verified this by talking to a fresh cce serve directly via MCP stdio. With an empty project (no cce init, 0 chunks in the vector store), sending tools/call: context_search enters _ensure_indexed() at integration/mcp_server.py:886-903, which silently calls run_indexing(self._config, self._project_dir, full=False). The MCP search request blocks while indexing runs and only returns once it's done.

For a tiny project (3 files, ~9 chunks) this was just a one-time embedder load (cce serve RSS jumped 310 → 421 MB) with no forkserver pool. For a project the size of mine I'd expect it to spawn the same 4-worker pool that cce index does — same code path. I didn't directly measure that case because I didn't want to risk OOMing my WSL while testing.

The user-facing problem is the silence. From the MCP client's side it looks like context_search is "taking unusually long" with no progress indicator and no warning that an indexing pass is happening underneath. A response like "Index empty; indexing in background, retry in ~N seconds" would be much friendlier than blocking silently.
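The non-blocking behavior I'd expect, sketched under the assumption that _ensure_indexed() can hand off to a background thread (the class and field names below are mine, not the actual mcp_server.py API):

```python
import threading


class IndexGate:
    """Answer search calls immediately while a first-time index runs in
    the background, instead of blocking the MCP request until it's done."""

    def __init__(self, run_indexing):
        self._run_indexing = run_indexing
        self._lock = threading.Lock()
        self._started = False
        self._done = threading.Event()

    def ensure_indexed(self) -> dict:
        if self._done.is_set():
            return {"status": "ready"}
        with self._lock:
            if not self._started:
                self._started = True
                threading.Thread(target=self._index, daemon=True).start()
        # Tell the MCP client what's happening instead of silently blocking.
        return {"status": "indexing", "retry_after_s": 5}

    def _index(self):
        self._run_indexing()
        self._done.set()
```

Even just the status payload, without the thread handoff, would remove the "taking unusually long" mystery on the client side.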

Small related thing: cce search (CLI) doesn't inherit FASTEMBED_CACHE_PATH from cce serve's env. If I set the env in opencode's environment block for the MCP config (so cce serve uses my persistent cache), cce search from my own shell still falls back to /tmp unless I also export the var globally. Minor footgun, but it caught me out when I was testing.

What did you expect?

  1. Hook installed by cce init to work on first commit (drop the dead --changed-only flag).
  2. Model download to bound its wait and fail loudly with a useful error instead of hanging indefinitely.
  3. Cache to survive reboot (default to $HOME/.cache/fastembed or read FASTEMBED_CACHE_PATH from CCE config explicitly).
  4. Hook script to either probe the port for liveness first, or for cce serve to clean up its serve.port on exit.
  5. First context_search on an empty index to either fail with a clear "run cce init first" message, or to return an immediate "indexing in background" response instead of blocking.

Steps to reproduce

For #1 — clean repro, any project:

$ cd /any/project && cce init
$ cat .git/hooks/post-commit
#!/bin/sh
# cce hook
/path/to/cce index --changed-only >/dev/null 2>&1 &
$ cce index --changed-only
Error: No such option: --changed-only

For #2 / #3 — depends on network luck:

$ rm -rf /tmp/fastembed_cache       # or wait for WSL to reboot
$ cce serve --project-dir /any/project
# most of the time this works in a few seconds
# sometimes it hangs at "Fetching 5 files: 20%" with ss showing
# established connections to HF CDN but zero progress

When it does hang, the leftover cache contents will look like:

blobs/0d7726d0... (config.json, 706 B)
blobs/688882a7... (tokenizer.json, 711 KB)
blobs/75305659... (tokenizer_config.json, 1.2 KB)
blobs/9bbecc17... (special_tokens_map.json, 695 B)
blobs/51f1bd0...incomplete (0 B)   ← the missing ONNX

Every subsequent CCE invocation will then crash on the missing ONNX until manually cleaned.

For #4 — easy:

# start any cce serve, kill it
cce serve --project-dir /any/project &
SERVE_PID=$!
sleep 5
kill -9 $SERVE_PID
# serve.port still in ~/.cce/projects/<basename>/

time ~/.cce/hooks/cce_hook.sh PostToolUse < /dev/null
# ~1 second wasted per call

For #5 — empty project:

mkdir -p /tmp/empty-test/src
echo "def foo(): pass" > /tmp/empty-test/src/a.py
cce serve --project-dir /tmp/empty-test
# don't run cce init — leave index empty
# from another shell or MCP client, call tools/call: context_search
# watch the request block while run_indexing runs silently

Relevant logs or error output

Debug logs from the investigation (gist):
https://gist.github.com/AZagatti/7393f669a0fd785d7153e07a52a11127

Most relevant for this issue:

  • 01-syn-sent-hang.log — first stuck-download repro (IPv4 SYN-SENT)
  • 02-stuck-on-broken-cache.log — second one, 5 ESTABLISHED IPv6 sockets and 0 bytes
  • 05-healthy-serve-startup.log — what a healthy startup looks like once the model is pre-staged

Python version

3.13.5

OS

Ubuntu 24.04 LTS on WSL2 (kernel 6.6.87.2-microsoft-standard-WSL2)

CCE version

0.4.19
