Auto-Compact Based on Model Context Window #67

@theepicsaxguy

Goal: Automatically compact sessions when the context window nears its limit to prevent truncation or errors.


1. Define Triggers and Thresholds

  • Context Usage Threshold:
    • Trigger auto-compact when context usage exceeds 80% of the model’s context window (configurable in Settings).
    • Example: For a 128K-token model, auto-compact once usage passes roughly 102K tokens (80% of 128K).
  • Session Length Threshold:
    • Optional: Auto-compact if the session exceeds 50 messages (configurable).
  • Idle Trigger:
    • Auto-compact when the user navigates away from the session or after 5 minutes of inactivity.
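
The three triggers above could be combined into a single decision function. This is only a minimal sketch: `AutoCompactSettings`, `SessionSnapshot`, and `autoCompactReason` are hypothetical names, not existing code, and the defaults simply mirror the numbers proposed in this section.

```typescript
// Hypothetical types and names; nothing here exists in the codebase yet.
interface AutoCompactSettings {
  enabled: boolean;
  contextThreshold: number;   // fraction of the context window, e.g. 0.8
  maxMessages: number | null; // null disables the session-length trigger
  idleMs: number;             // inactivity before the idle trigger fires
}

// Defaults taken from the proposal: 80%, 50 messages, 5 minutes.
const DEFAULT_SETTINGS: AutoCompactSettings = {
  enabled: true,
  contextThreshold: 0.8,
  maxMessages: 50,
  idleMs: 5 * 60 * 1000,
};

interface SessionSnapshot {
  tokensUsed: number;
  contextWindow: number;
  messageCount: number;
  msSinceLastActivity: number;
  navigatedAway: boolean;
}

type TriggerReason = "context" | "length" | "idle" | null;

// Returns which trigger fired, or null if no auto-compact is due.
function autoCompactReason(
  s: SessionSnapshot,
  cfg: AutoCompactSettings = DEFAULT_SETTINGS,
): TriggerReason {
  if (!cfg.enabled) return null;
  if (s.tokensUsed > s.contextWindow * cfg.contextThreshold) return "context";
  if (cfg.maxMessages !== null && s.messageCount > cfg.maxMessages) return "length";
  if (s.navigatedAway || s.msSinceLastActivity >= cfg.idleMs) return "idle";
  return null;
}
```

Checking the context threshold first means the most urgent reason is the one reported when several triggers apply at once.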

2. Backend Changes

  • Context Monitoring:
    • Extend the existing ContextUsageIndicator [1] to emit events when the threshold is crossed.
    • Use the summarizeSession endpoint to trigger compaction.
  • Compaction Logic:
    • Reuse the existing /compact slash command logic but invoke it automatically.
    • Log compaction events for debugging (e.g., "Session auto-compacted at 82% context usage").
  • Model Awareness:
    • Fetch the model’s context window from the provider’s metadata (e.g., models[modelID].limit.context).
    • Fall back to a default (e.g., 128K) if metadata is missing.
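
A rough sketch of the model-awareness lookup and the debug log line, under stated assumptions: the `ModelInfo` shape is inferred only from the `models[modelID].limit.context` path mentioned above, the function names are hypothetical, and the 128K default is taken as 128,000 tokens (it could equally be 131,072).

```typescript
// Assumed metadata shape, inferred from `models[modelID].limit.context`.
interface ModelInfo {
  limit?: { context?: number };
}

// Assumption: "128K" is read as 128,000 tokens.
const DEFAULT_CONTEXT_WINDOW = 128_000;

// Hypothetical helper: returns the model's context window, or the default
// when the model or its limit metadata is missing or invalid.
function resolveContextWindow(
  models: Record<string, ModelInfo>,
  modelID: string,
): number {
  const ctx = models[modelID]?.limit?.context;
  return typeof ctx === "number" && ctx > 0 ? ctx : DEFAULT_CONTEXT_WINDOW;
}

// Hypothetical helper: builds the debug log line in the format suggested above.
function compactionLogLine(tokensUsed: number, contextWindow: number): string {
  const pct = Math.round((tokensUsed / contextWindow) * 100);
  return `Session auto-compacted at ${pct}% context usage`;
}
```

Keeping the fallback in one place means the threshold check never has to deal with a missing or zero context window.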

3. Frontend UX

  • Non-Blocking Toast:
    • Show a toast when auto-compaction starts: "Compacting session to save context space...".
    • Show a success toast: "Session compacted. Tokens reduced from X to Y."
  • Manual Override:
    • Add a "Disable Auto-Compact" toggle in Settings.
    • Add a "Compact Now" button in the session header for manual compaction.
  • Visual Feedback:
    • Animate the ContextUsageIndicator when auto-compaction is triggered (e.g., pulsing red at 90%+ usage).
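
For the frontend side, the indicator state and toast text could be derived from plain functions that the UI layer then renders. This is an illustrative sketch only: `indicatorState` and `successToast` are hypothetical helpers, and the intermediate "warning" state between 80% and 90% is an assumption that is not spelled out above.

```typescript
// Hypothetical helpers; names and the "warning" state are assumptions.
type IndicatorState = "normal" | "warning" | "critical";

// Maps a usage fraction (0..1) to a visual state for the indicator.
function indicatorState(
  usage: number,
  compactAt = 0.8,
  criticalAt = 0.9,
): IndicatorState {
  if (usage >= criticalAt) return "critical"; // e.g. pulsing red
  if (usage >= compactAt) return "warning";
  return "normal";
}

// Builds the success toast text in the format proposed above.
function successToast(tokensBefore: number, tokensAfter: number): string {
  return `Session compacted. Tokens reduced from ${tokensBefore} to ${tokensAfter}.`;
}
```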

4. Edge Cases

  • Failed Compaction:
    • If compaction fails, show an error toast: "Auto-compact failed. Try manually or reduce context."
    • Retry once after 30 seconds.
  • Rate Limiting:
    • Prevent repeated auto-compacts within a short window (e.g., at most one every 5 minutes).
  • Offline Mode:
    • Skip auto-compact while the user is offline and queue it to run once they are back online.
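
The edge cases above could be handled by a small guard that wraps the compaction call. This is a sketch under assumptions, not a proposed final design: `AutoCompactGuard`, its options, and the outcome names are all hypothetical, and the clock, online check, and sleep function are injected so the behaviour can be tested without real timers.

```typescript
// All names below are hypothetical and for illustration only.
type CompactFn = () => Promise<void>;
type Outcome = "compacted" | "rate_limited" | "queued_offline" | "failed";

interface GuardOptions {
  minIntervalMs: number;               // e.g. 5 minutes between auto-compacts
  retryDelayMs: number;                // e.g. 30 seconds before the single retry
  now: () => number;                   // injected clock
  isOnline: () => boolean;             // injected connectivity check
  sleep: (ms: number) => Promise<void>;
}

class AutoCompactGuard {
  private compact: CompactFn;
  private opts: GuardOptions;
  private lastRunAt: number | null = null;
  private pending = false;

  constructor(compact: CompactFn, opts: GuardOptions) {
    this.compact = compact;
    this.opts = opts;
  }

  async request(): Promise<Outcome> {
    // Offline: remember the request and run it on reconnect.
    if (!this.opts.isOnline()) {
      this.pending = true;
      return "queued_offline";
    }
    // Rate limit: at most one attempt per minIntervalMs.
    const now = this.opts.now();
    if (this.lastRunAt !== null && now - this.lastRunAt < this.opts.minIntervalMs) {
      return "rate_limited";
    }
    this.lastRunAt = now;
    this.pending = false;
    try {
      await this.compact();
      return "compacted";
    } catch {
      // Retry exactly once after the configured delay.
      await this.opts.sleep(this.opts.retryDelayMs);
      try {
        await this.compact();
        return "compacted";
      } catch {
        return "failed"; // caller shows the error toast
      }
    }
  }

  // Call when connectivity returns; runs a queued request if there is one.
  async onReconnect(): Promise<Outcome | null> {
    return this.pending ? this.request() : null;
  }
}
```

In this sketch a failed attempt still counts toward the rate limit, so a persistently failing compaction is not retried more than twice per window. Whether that is the desired behaviour is open for discussion.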
