-
Notifications
You must be signed in to change notification settings - Fork 26
Open
Description
Goal: Automatically compact sessions when the context window nears its limit to prevent truncation or errors.
1. Define Triggers and Thresholds
- Context Usage Threshold:
- Trigger auto-compact when context usage exceeds 80% of the model’s context window (configurable in Settings).
- Example: For a 128K-token model, auto-compact at 102K tokens.
- Session Length Threshold:
- Optional: Auto-compact if the session exceeds 50 messages (configurable).
- Idle Trigger:
- Auto-compact when the user navigates away from the session or after 5 minutes of inactivity.
2. Backend Changes
- Context Monitoring:
- Extend the existing
ContextUsageIndicator[1] to emit events when the threshold is crossed. - Use the
summarizeSessionendpoint to trigger compaction.
- Extend the existing
- Compaction Logic:
- Reuse the existing
/compactslash command logic but invoke it automatically. - Log compaction events for debugging (e.g., "Session auto-compacted at 82% context usage").
- Reuse the existing
- Model Awareness:
- Fetch the model’s context window from the provider’s metadata (e.g.,
models[modelID].limit.context). - Fall back to a default (e.g., 128K) if metadata is missing.
- Fetch the model’s context window from the provider’s metadata (e.g.,
3. Frontend UX
- Non-Blocking Toast:
- Show a toast when auto-compaction starts: "Compacting session to save context space...".
- Show a success toast: "Session compacted. Tokens reduced from X to Y."
- Manual Override:
- Add a "Disable Auto-Compact" toggle in Settings.
- Add a "Compact Now" button in the session header for manual compaction.
- Visual Feedback:
- Animate the
ContextUsageIndicatorwhen auto-compaction is triggered (e.g., pulsing red at 90%+ usage).
- Animate the
4. Edge Cases
- Failed Compaction:
- If compaction fails, show an error toast: "Auto-compact failed. Try manually or reduce context."
- Retry once after 30 seconds.
- Rate Limiting:
- Prevent multiple auto-compacts in a short time (e.g., 1 per 5 minutes).
- Offline Mode:
- Skip auto-compact if the user is offline (queue for later).
Metadata
Metadata
Assignees
Labels
No labels
Projects
Status
No status