Problem
The JSONL file watcher (server/watcher.py) tracks read positions per file using a simple offset dictionary:
```python
# Line 16
_file_positions: dict[str, int] = {}
```
When reading new content, it seeks to the last known position and reads forward:
```python
# Lines 265-269
with open(file_path, "r", encoding="utf-8", errors="replace") as f:
    f.seek(last_pos)
    new_lines = f.readlines()
    _file_positions[file_path] = f.tell()
```
The problem: When Claude Code compacts a session, it rewrites the JSONL file. The file's structure changes — it may be shorter than the previously recorded position, entries may be consolidated, and the byte offsets shift entirely.
What happens today during compaction
1. Claude Code rewrites/truncates the JSONL file (the file becomes shorter).
2. `watchfiles` (using kqueue on macOS / inotify on Linux) detects the modification.
3. `_schedule_process()` is called, which debounces for 1 second (`DEBOUNCE_SECONDS = 1.0` at line 19).
4. `_process_file_changes()` runs:
   - `last_pos` is the old position (e.g., byte 50000)
   - `f.seek(50000)` is called on a file that is now only 20000 bytes
   - `f.readlines()` returns an empty list (we're past EOF)
   - `_file_positions[file_path]` is updated to the end of the shorter file

Result: All new content in the compacted file is silently missed. The watcher thinks it's caught up, but it skipped everything.
Additional compaction scenarios
- File replacement (write to temp, rename): On some systems, `watchfiles` may see this as a `Change.added` event for the new file. The old position is still stored under the same path key, leading to the same seek-past-EOF issue.
- Multiple rapid compactions: If compaction happens during the debounce window, only the final state is processed — intermediate states are lost (which is actually fine for compaction, but the position tracking is still broken).
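The rename-replace case can be covered by the same reset idea: drop the stored offset when an "added" event arrives for a path we already track. A minimal sketch, where the event kind string stands in for watchfiles' `Change` enum and the handler name is hypothetical:

```python
def handle_event(kind: str, path: str, positions: dict[str, int]) -> None:
    # "added" stands in for watchfiles' Change.added. An added event for a
    # path we already track usually means the file was replaced via
    # write-to-temp + rename, so the stored byte offset is stale.
    if kind == "added" and path in positions:
        del positions[path]  # forces the next read to start from position 0

positions = {"session.jsonl": 50000}
handle_event("added", "session.jsonl", positions)
# positions is now empty: the next read starts from byte 0
```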
What needs to happen
1. Detect file truncation/rewrite
Before seeking to last_pos, check if the file has been truncated:
```python
import os

file_size = os.path.getsize(file_path)
last_pos = _file_positions.get(file_path, 0)
if file_size < last_pos:
    # File was truncated/rewritten — reset position and re-read from start
    logger.info(
        "File %s appears compacted (size %d < last_pos %d), re-reading",
        file_path, file_size, last_pos,
    )
    last_pos = 0
```
This is the minimum fix. If file_size < last_pos, the file has been rewritten and we need to re-read from the beginning.
2. Handle duplicate transcript entries on re-read
When we reset to position 0 and re-read the entire compacted file, we'll encounter entries that were already stored in the database from previous reads. The transcript storage logic needs to handle this gracefully:
- Option A: Use upsert/ignore semantics when writing transcripts — if a transcript entry with the same session_id + timestamp + content already exists, skip it.
- Option B: Clear existing transcripts for the session before re-ingesting (destructive, simpler).
- Option C: Track a content hash or message ID alongside the position, and use that to detect which entries are new.
Recommendation: Option A (upsert/ignore) is the safest and most general approach.
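A minimal sketch of Option A, assuming a SQLite-backed transcript store; the table and column names here are illustrative, not the project's actual schema. A unique index over the identifying fields plus `INSERT OR IGNORE` makes re-ingestion idempotent:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transcripts ("
    "  session_id TEXT, ts TEXT, content TEXT,"
    "  UNIQUE (session_id, ts, content))"
)

def store_entry(session_id: str, ts: str, content: str) -> None:
    # INSERT OR IGNORE silently skips rows that collide with the unique
    # index, so re-reading a compacted file cannot create duplicates.
    conn.execute(
        "INSERT OR IGNORE INTO transcripts VALUES (?, ?, ?)",
        (session_id, ts, content),
    )

store_entry("s1", "2024-01-01T00:00:00Z", "hello")
store_entry("s1", "2024-01-01T00:00:00Z", "hello")  # duplicate from re-read
count = conn.execute("SELECT COUNT(*) FROM transcripts").fetchone()[0]
```

The same effect is available via `ON CONFLICT DO NOTHING` in SQLite and PostgreSQL if the store ever migrates.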
3. Handle file-history-snapshot entries
The JSONL parser (watcher.py:25-124) currently skips "file-history-snapshot" type entries. After compaction, Claude Code may include different metadata entries. Verify that the parser handles any new entry types that appear in compacted files gracefully (skips unknown types without crashing).
4. Test coverage
- Truncation test: Write a JSONL file, process it, truncate and rewrite with new content, process again — verify new content is captured and position is reset.
- Shorter-file test: Write a large JSONL, process it, replace with a shorter JSONL, process — verify no data is silently lost.
- Duplicate handling test: Process a file, simulate compaction that keeps some of the same entries, re-process — verify no duplicate transcripts in the database.
- Rapid compaction test: Trigger multiple file rewrites within the debounce window (1 second) — verify the final state is correctly captured.
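The truncation and shorter-file tests can be sketched with a minimal stand-in reader that applies the size check from step 1 (the helper here is illustrative, not the real watcher code):

```python
import os
import tempfile

def read_new_lines(path: str, positions: dict[str, int]) -> list[str]:
    # Stand-in for the watcher's read logic, with the truncation check applied.
    last_pos = positions.get(path, 0)
    if os.path.getsize(path) < last_pos:
        last_pos = 0  # compaction detected: re-read from the start
    with open(path, "r", encoding="utf-8", errors="replace") as f:
        f.seek(last_pos)
        lines = f.readlines()
        positions[path] = f.tell()
    return lines

positions: dict[str, int] = {}
with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "session.jsonl")
    with open(path, "w") as f:
        f.write('{"type": "user"}\n' * 100)
    first = read_new_lines(path, positions)   # initial read: 100 lines
    with open(path, "w") as f:                # compaction: fewer entries
        f.write('{"type": "summary"}\n')
    second = read_new_lines(path, positions)  # re-read from byte 0, not EOF
```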
5. Logging and observability
Add info-level logging when a compaction is detected so operators can correlate any dashboard anomalies with compaction events. Include the old position, new file size, and session ID.
Acceptance criteria
Technical context
- Watcher uses the `watchfiles` library (`awatch`) with recursive monitoring of `~/.claude/projects/`
- Only `Change.added` and `Change.modified` events are processed (line 399)
- 1-second debounce per file path (watcher.py:18-20, 368-381)
- Position tracking is in-memory only — lost on server restart (which is fine, but means restart also triggers a full re-read)