Skip to content

fix: setup and first-run pain points (closes #67)#70

Merged
rajkumarsakthivel merged 5 commits into
mainfrom
fix/67-setup-and-first-run
May 12, 2026
Merged

fix: setup and first-run pain points (closes #67)#70
rajkumarsakthivel merged 5 commits into
mainfrom
fix/67-setup-and-first-run

Conversation

@fazleelahhee
Copy link
Copy Markdown
Contributor

Closes #67. Lands the four most actionable items from the bundle in one
PR (per Raj's triage); leaving #2's full download-timeout work for a
separate follow-up since it touches the slow path and warrants its own
discussion of retry/backoff strategy.

Fixes in this PR

  1. Drop --changed-only from the post-commit hook template
    indexer/git_hooks.py. The flag was removed from cce index but
    the hook template still referenced it, so every commit silently
    errored. cce index (no flag) already does incremental indexing.

  2. Persistent fastembed cache + .incomplete sweep
    indexer/embedder.py. Three folded problems:

    • Default cache moves from \$TMPDIR/fastembed_cache to
      \$XDG_CACHE_HOME/fastembed (or ~/.cache/fastembed) — survives
      WSL's systemd-tmpfiles /tmp wipe.
    • New CCE_FASTEMBED_CACHE_PATH env var (priority over
      FASTEMBED_CACHE_PATH) for users isolating CCE's cache.
    • _sweep_incomplete_downloads() at Embedder init removes any
      *.incomplete blobs from a stalled huggingface_hub download so
      one flaky network event can't crash every subsequent run with
      NO_SUCHFILE.
    • Error message now points at the cache dir and suggests rm -rf
      recovery.
  3. TCP liveness probe in hook scripts
    memory/hook_installer.py. POSIX adds a bash /dev/tcp probe;
    Windows adds a PowerShell TcpClient probe. A stale serve.port
    no longer wastes 1-2s/curl call.

  4. Empty-index MCP returns a status response instead of blocking
    integration/mcp_server.py. First context_search on an empty
    index used to silently run a full reindex and only respond when it
    finished. Now kicks the index off as a background task and returns
    an immediate status line.

Deferred to a follow-up

  • feat(memory): claude-mem feature parity — series of 5 PRs #2 from the issue — full download timeout/retry on
    huggingface_hub
    . This PR addresses the sticky part of that
    failure (the .incomplete cleanup means one stall no longer
    poisons every future run) but bounding the download itself with
    retry/backoff is a separate change.

Tests

  • test_git_hooks.py +1 case
  • test_embedder.py +6 cases (cache dir precedence, sweep behaviour)
  • test_hook_installer.py +2 cases (POSIX + Windows probe present
    before any curl)
  • tests/integration/test_mcp_empty_index.py (new, 3 cases)

Verification

  • pytest -n 4887 passed, 1 skipped, 0 failed
  • ruff check on every changed file → clean

Test plan

  • Fresh WSL: rm -rf ~/.cache/fastembed; cce serve — model lands
    in ~/.cache/fastembed, survives reboot
  • Mid-download ^C, restart cce serve.incomplete gets
    swept, download retries cleanly
  • cce serve, kill with -9, time a hook event — under 100ms
    instead of 1-2s
  • cce serve --project-dir /tmp/empty then context_search over
    MCP — returns "Index is empty, indexing started" immediately
    instead of blocking
  • cce init && git commit --allow-empty — post-commit runs
    cce index cleanly with no errors in 2>&1 capture

Five connected setup-path bugs from AZagatti's report. All live in the
install / first-run / hook path, all silent failures that look like
working features.

1. Drop --changed-only from the post-commit hook template
   (`indexer/git_hooks.py`). The flag no longer exists on `cce index`,
   so every commit silently errored with
   "No such option: --changed-only". `cce index` (no flag) already does
   incremental indexing via the manifest's content-hash check, so the
   hook just needs the flag dropped.

2. Persistent fastembed cache + .incomplete sweep
   (`indexer/embedder.py`). Three problems folded together:
   - Default cache moves from `$TMPDIR/fastembed_cache` to
     `$XDG_CACHE_HOME/fastembed` (or `~/.cache/fastembed`) so it
     survives WSL/Ubuntu's systemd-tmpfiles `/tmp` wipe on every boot.
   - New `CCE_FASTEMBED_CACHE_PATH` env var (precedence over
     `FASTEMBED_CACHE_PATH`) lets users with multiple fastembed tools
     isolate CCE's cache.
   - `_sweep_incomplete_downloads()` runs at Embedder init and removes
     any `*.incomplete` blobs left behind by a stalled
     huggingface_hub download. Without this, one flaky network event
     produces a zero-byte ONNX file that crashes every subsequent run
     with NO_SUCHFILE until the user manually nukes the cache.
   - Embedder error message now points at the cache dir and suggests
     the rm-rf recovery so the failure mode is at least debuggable.

3. TCP liveness probe in hook scripts (`memory/hook_installer.py`).
   POSIX hook adds a bash /dev/tcp probe; Windows .cmd hook adds a
   PowerShell TcpClient probe. A stale `serve.port` no longer burns
   1-2s/curl call when nothing's listening — which on long Claude
   Code sessions with hundreds of PostToolUse/UserPromptSubmit events
   adds up to many minutes of dead wait.

4. Empty-index MCP returns a status response instead of blocking
   (`integration/mcp_server.py`). First `context_search` on an empty
   index used to silently run a full reindex and only respond when it
   finished — from the MCP client's side it looked indistinguishable
   from a hang. The new behaviour kicks indexing off as a background
   task and immediately returns a one-line status telling the user
   to retry or run `cce index`.

Tests:
- test_git_hooks.py: asserts --changed-only no longer appears in any
  installed hook
- test_embedder.py +6 cases: cache dir resolution order
  (CCE_FASTEMBED_CACHE_PATH > FASTEMBED_CACHE_PATH > XDG_CACHE_HOME >
  ~/.cache), .incomplete sweep removes stale partials but preserves
  good cache files, missing-dir is a no-op
- test_hook_installer.py +2 cases: POSIX and Windows hook templates
  contain a TCP liveness probe before any curl call
- tests/integration/test_mcp_empty_index.py (new, 3 cases): empty
  index returns a status text, populated index runs real retrieval,
  _ensure_indexed returns the right boolean for each state

Full suite: 887 passed, 1 skipped, 0 failed. Ruff clean.
The Embedder now defaults its fastembed cache to ~/.cache/fastembed
(#67). CI's "Warm fastembed model cache" step still downloaded to
fastembed's natural default (/tmp/fastembed_cache), so tests came up
with my cache_dir empty — under pytest-xdist's parallel workers,
several started concurrent downloads to that fresh dir and produced
partial-snapshot races that ended in NO_SUCHFILE on the ONNX load.

Both the warm step and the test step now point at
\$RUNNER_TEMP/fastembed-cache via FASTEMBED_CACHE_PATH. fastembed
itself honours that var, so the warm step still writes via the natural
TextEmbedding default, and the tests' Embedder resolves to the same
directory through the new _resolve_cache_dir() chain.
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR addresses several setup/first-run pain points (issue #67) by fixing broken git hook templates, making fastembed’s cache persistent and self-healing, reducing hook latency when the server port is stale, and ensuring MCP context_search on an empty index returns immediately with a status message while indexing runs in the background.

Changes:

  • Update git hook templates to call cce index without the removed --changed-only flag.
  • Add persistent fastembed cache resolution + .incomplete cleanup and improve load error messaging.
  • Add TCP liveness probes to hook scripts and make MCP empty-index searches non-blocking via background indexing.

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 6 comments.

Show a summary per file
File Description
src/context_engine/indexer/git_hooks.py Removes --changed-only from hook script template.
src/context_engine/indexer/embedder.py Adds persistent cache dir resolution, sweeps *.incomplete, passes cache_dir to fastembed, improves error message.
src/context_engine/memory/hook_installer.py Adds POSIX/Windows TCP liveness probes before curl.
src/context_engine/integration/mcp_server.py Changes lazy indexing to run in background and returns a status response when index is empty.
tests/indexer/test_git_hooks.py Adds regression test ensuring hooks don’t reference --changed-only.
tests/indexer/test_embedder.py Adds tests for cache-dir precedence and .incomplete sweeping behavior.
tests/memory/test_hook_installer.py Adds tests asserting TCP probe exists and runs before curl.
tests/integration/test_mcp_empty_index.py Adds integration coverage for empty-index MCP behavior and _ensure_indexed() return value.
.github/workflows/ci.yml Pins FASTEMBED_CACHE_PATH so cache warming and tests share the same directory.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/context_engine/indexer/embedder.py Outdated
Comment thread src/context_engine/indexer/embedder.py Outdated
Comment thread src/context_engine/memory/hook_installer.py
Comment thread src/context_engine/memory/hook_installer.py
Comment thread tests/integration/test_mcp_empty_index.py
Comment thread src/context_engine/indexer/git_hooks.py Outdated
Six review findings, all resolved:

1. Security: validate PORT before shell interpolation (POSIX). A
   corrupted/hostile serve.port containing $(...) or backticks would
   be evaluated by the shell when bash -c interpolates "${PORT}".
   Refuse anything that isn't a digit-only string in the valid TCP
   range (1-65535) before any shell expansion.

2. Security: same validation on the Windows .cmd hook before the
   PowerShell -Command line interpolates %PORT%. findstr regex
   enforces digits-only; numeric range checks cap the value.

3. Shell-quote bin_path in git_hooks. Resolved binary paths
   commonly contain spaces (Windows user profiles, macOS users
   with a space in their name) and the unquoted path tokenised at
   the space. shlex.quote handles both cases.

4. _resolve_cache_dir docstring listed FASTEMBED_CACHE_PATH as
   priority 1 but the code checks CCE_FASTEMBED_CACHE_PATH first.
   Reordered the doc to match the implementation.

5. Replace `rm -rf {cache_dir}` recovery hint with a platform-neutral
   "delete the cache directory" message.

6. test_mcp_empty_index recorded `indexing_called` but never asserted
   on it. Added a `await asyncio.sleep(0)` to let the create_task'd
   coroutine run, then assert the list contains exactly one True.

New regression tests:
- test_posix_hook_rejects_non_numeric_port
- test_windows_hook_rejects_non_numeric_port
- test_post_commit_hook_quotes_bin_path_with_spaces

890 passed, 1 skipped, 0 failed. Ruff clean.
Copy link
Copy Markdown
Member

@rajkumarsakthivel rajkumarsakthivel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. All 4 actionable items from #67 landed.

Hook template fix, persistent cache with .incomplete sweep, TCP liveness probe (both POSIX and Windows), and non-blocking empty-index response. Download timeout deferred to a follow-up is the right call. Good test coverage.

@rajkumarsakthivel rajkumarsakthivel merged commit e77dbea into main May 12, 2026
10 checks passed
@rajkumarsakthivel rajkumarsakthivel deleted the fix/67-setup-and-first-run branch May 12, 2026 19:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Setup / first-run issues: broken post-commit hook, model download stalls, /tmp cache wipe, silent lazy index

3 participants