---
layout: default
title: Memory & Sessions
parent: OpenClaw Deep Dive
nav_order: 5
---
Welcome to Chapter 5: Memory & Sessions. In this part of the OpenClaw Deep Dive Tutorial, you will build an intuitive mental model first, then move into concrete implementation details and practical production tradeoffs.
One of OpenClaw's defining features is persistent memory — the assistant remembers your preferences, past conversations, and important facts across sessions and even across channels. This chapter explores the memory architecture, session management, and context strategies that make this possible.
```mermaid
graph TB
subgraph ShortTerm["Short-Term (Session Context)"]
CW[Context Window<br/>Recent messages]
WM[Working Memory<br/>Current task state]
end
subgraph LongTerm["Long-Term (Persistent)"]
FM[Fact Memory<br/>User preferences, facts]
EM[Episodic Memory<br/>Past conversations]
SM[Semantic Memory<br/>Learned concepts]
end
subgraph Retrieval["Retrieval"]
EMB[Embedding Search]
KW[Keyword Match]
REC[Recency Bias]
REL[Relevance Scoring]
end
CW --> EMB
WM --> KW
EMB --> FM
EMB --> EM
KW --> SM
REC --> REL
REL --> CW
```
The context window holds the most recent conversation turns:
```typescript
// `Message` (role, content, …) is assumed defined elsewhere in the codebase.
interface ContextWindow {
messages: Message[];
max_tokens: number; // e.g., 128,000 for Claude
current_tokens: number;
summary: string | null; // Compressed older context
}
class ContextWindowManager {
  private windows: Map<string, ContextWindow> = new Map();

  // Dependencies implied by the original sketch, made explicit here:
  constructor(
    private config: { max_context_tokens: number },
    // Minimal LLM client shape, used by compress() below
    private llm: {
      call(req: {
        system: string;
        messages: { role: string; content: string }[];
      }): Promise<{ text: string }>;
    },
  ) {}
getOrCreate(sessionId: string): ContextWindow {
if (!this.windows.has(sessionId)) {
this.windows.set(sessionId, {
messages: [],
max_tokens: this.config.max_context_tokens,
current_tokens: 0,
summary: null,
});
}
return this.windows.get(sessionId)!;
}
async addMessage(sessionId: string, message: Message) {
const window = this.getOrCreate(sessionId);
const tokens = this.countTokens(message.content);
window.messages.push(message);
window.current_tokens += tokens;
// If approaching limit, compress older messages
if (window.current_tokens > window.max_tokens * 0.8) {
await this.compress(sessionId);
}
}
/**
* Compress older messages into a summary to free token budget.
* Keeps the most recent messages intact for coherent conversation.
*/
private async compress(sessionId: string) {
const window = this.getOrCreate(sessionId);
const keepRecent = 20; // Keep last 20 messages verbatim
if (window.messages.length <= keepRecent) return;
// Messages to summarize
const toSummarize = window.messages.slice(
0,
window.messages.length - keepRecent
);
// Generate summary via LLM
const summary = await this.llm.call({
system: "Summarize this conversation concisely, preserving key facts, decisions, and context needed for continuity.",
messages: [{
role: "user",
content: toSummarize.map(m =>
`[${m.role}]: ${m.content}`
).join("\n"),
}],
});
// Replace old messages with summary
window.summary = window.summary
? `${window.summary}\n\n${summary.text}`
: summary.text;
window.messages = window.messages.slice(-keepRecent);
window.current_tokens = this.countTokens(
window.summary + window.messages.map(m => m.content).join("")
);
}
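  // countTokens is referenced above but not defined in the original sketch.
  // A real implementation would use the model's tokenizer; a rough
  // chars-per-token heuristic is a common stand-in.
  private countTokens(text: string): number {
    return Math.ceil(text.length / 4); // ~4 characters per token (heuristic)
  }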
}
```

Facts are discrete pieces of information about users:

```typescript
interface FactEntry {
id: string;
user_id: string;
fact: string; // "Prefers dark mode"
category: FactCategory;
importance: number; // 1-10
confidence: number; // 0-1
source_session: string;
created_at: number;
last_accessed: number;
access_count: number;
embedding: number[]; // For semantic search
}
enum FactCategory {
PREFERENCE = "preference", // "Likes Python over JS"
PERSONAL = "personal", // "Lives in NYC"
WORK = "work", // "Works at Acme Corp"
HABIT = "habit", // "Usually works late"
RELATIONSHIP = "relationship", // "Has a dog named Max"
TECHNICAL = "technical", // "Uses VS Code"
SCHEDULE = "schedule", // "Meeting every Tuesday 2pm"
}
class FactMemory {
private db: Database;
private embedder: Embedder;
async store(entry: Omit<FactEntry, "id" | "embedding">) {
// Generate embedding for semantic search
const embedding = await this.embedder.embed(entry.fact);
// Check for contradictions with existing facts
const existing = await this.findSimilar(entry.user_id, entry.fact);
if (existing && existing.similarity > 0.9) {
// Update existing fact instead of duplicating
await this.update(existing.id, {
fact: entry.fact,
confidence: Math.max(existing.confidence, entry.confidence),
last_accessed: Date.now(),
});
return;
}
// Store new fact
await this.db.insert("facts", {
...entry,
id: crypto.randomUUID(),
embedding,
created_at: Date.now(),
last_accessed: Date.now(),
access_count: 0,
});
}
async retrieve(query: RetrievalQuery): Promise<FactEntry[]> {
const queryEmbedding = await this.embedder.embed(query.query);
    // Semantic search with relevance scoring. The similarity alias is
    // computed in a CTE so it can be referenced by the outer WHERE clause and
    // relevance expression (a SELECT alias cannot be reused within the same
    // SELECT). importance (1-10) is normalized to 0-1 so all three signals
    // share a scale.
    const results = await this.db.query(`
      WITH scored AS (
        SELECT *, (1 - cosine_distance(embedding, $1)) AS similarity
        FROM facts
        WHERE user_id = $2
      )
      SELECT *,
             (importance / 10.0) * 0.3 +
             similarity * 0.5 +
             recency_score(last_accessed) * 0.2 AS relevance
      FROM scored
      WHERE similarity > $3
      ORDER BY relevance DESC
      LIMIT $4
    `, [queryEmbedding, query.user_id, query.relevance_threshold, query.limit]);
// Update access metadata
for (const result of results) {
await this.db.update("facts", result.id, {
last_accessed: Date.now(),
access_count: result.access_count + 1,
});
}
return results;
}
}
```

Episodic memory stores summaries of past conversation episodes:

```typescript
interface Episode {
id: string;
user_id: string;
session_id: string;
summary: string;
topics: string[];
key_decisions: string[];
outcome: string;
timestamp: number;
duration_minutes: number;
message_count: number;
embedding: number[];
}
class EpisodicMemory {
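  // db, llm, embedder, and the extract* helpers are assumed injected here,
  // following the same conventions as FactMemory above.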
/**
* When a session ends or goes idle, create an episode summary.
*/
async createEpisode(session: Session) {
const summary = await this.llm.call({
system: `Summarize this conversation as a past episode. Include:
- What was discussed
- Key decisions made
- Any action items or outcomes
- Emotional tone
Format as concise narrative.`,
messages: [{
role: "user",
content: session.context_window
.map(m => `[${m.role}]: ${m.content}`)
.join("\n"),
}],
});
const topics = await this.extractTopics(session);
await this.store({
user_id: session.participants[0],
session_id: session.id,
summary: summary.text,
topics,
key_decisions: await this.extractDecisions(session),
outcome: await this.extractOutcome(session),
timestamp: session.created_at,
duration_minutes: (Date.now() - session.created_at) / 60_000,
message_count: session.context_window.length,
});
}
/**
* Retrieve relevant past episodes for context enrichment.
*/
async recall(query: string, userId: string): Promise<Episode[]> {
const embedding = await this.embedder.embed(query);
return this.db.query(`
SELECT *, (1 - cosine_distance(embedding, $1)) as relevance
FROM episodes
WHERE user_id = $2
ORDER BY relevance DESC
LIMIT 5
`, [embedding, userId]);
}
}
```

```typescript
enum SessionMode {
/**
* MAIN: Direct 1:1 conversation.
* One session per user per channel.
* Full tool access, persistent memory.
*/
MAIN = "main",
/**
* GROUP: Channel/group chat.
* One session per group. Activated by mention or reply.
* Limited tool access (configurable per group).
* Docker sandbox for code execution.
*/
GROUP = "group",
/**
* QUEUE: Sequential processing mode.
* Messages processed one at a time in order.
* Used for high-throughput channels or automation.
*/
QUEUE = "queue",
}
```
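To make the modes concrete, here is a minimal routing sketch (an assumed helper, not from the OpenClaw source):

```typescript
// Hypothetical helper: pick a SessionMode (from the enum above) based on
// incoming message metadata.
function resolveSessionMode(msg: { isGroup: boolean; queued?: boolean }): SessionMode {
  if (msg.queued) return SessionMode.QUEUE; // automation / high-throughput
  return msg.isGroup ? SessionMode.GROUP : SessionMode.MAIN;
}
```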
Sessions persist across restarts:

```typescript
class SessionStore {
private db: Database;
  private cache: Map<string, Session> = new Map();
  private episodicMemory!: EpisodicMemory; // assumed injected; used by prune()
async get(sessionId: string): Promise<Session | null> {
// Check in-memory cache first
if (this.cache.has(sessionId)) {
return this.cache.get(sessionId)!;
}
// Load from disk
const data = await this.db.get("sessions", sessionId);
if (!data) return null;
const session = this.deserialize(data);
this.cache.set(sessionId, session);
return session;
}
async save(session: Session) {
this.cache.set(session.id, session);
// Persist to disk (debounced — doesn't write on every message)
this.debouncedWrite(session.id, this.serialize(session));
}
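  // debouncedWrite is referenced above but not shown in the original sketch.
  // A plausible implementation coalesces rapid saves per session and flushes
  // at most once per delay window (upsert semantics assumed on this.db).
  private pendingWrites = new Map<string, ReturnType<typeof setTimeout>>();
  private debouncedWrite(id: string, data: unknown, delayMs = 2_000) {
    clearTimeout(this.pendingWrites.get(id));
    this.pendingWrites.set(id, setTimeout(() => {
      this.pendingWrites.delete(id);
      void this.db.update("sessions", id, data);
    }, delayMs));
  }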
/**
* Prune old sessions to free storage.
*/
async prune(maxAge: number) {
const cutoff = Date.now() - maxAge;
const expired = await this.db.query(
"SELECT id FROM sessions WHERE last_active < $1",
[cutoff]
);
for (const { id } of expired) {
// Create episodic memory before pruning
const session = await this.get(id);
if (session) {
await this.episodicMemory.createEpisode(session);
}
await this.db.delete("sessions", id);
this.cache.delete(id);
}
return expired.length;
}
}
```

Group sessions have stricter isolation:

```typescript
class GroupSessionManager {
/**
* Create an isolated session for a group chat.
* Each group gets its own context, tools, and permissions.
*/
createGroupSession(
channel: string,
groupId: string
): Session {
return {
id: `group:${channel}:${groupId}`,
mode: SessionMode.GROUP,
channel,
participants: [],
created_at: Date.now(),
last_active: Date.now(),
context_window: [],
memory_refs: [],
tool_permissions: this.getGroupPermissions(groupId),
isolation: {
// Code execution in Docker sandbox
sandbox: true,
// No access to personal files
file_access: false,
// Limited browser access
browser_access: "read_only",
// No access to other users' memories
cross_user_memory: false,
},
};
}
private getGroupPermissions(groupId: string): string[] {
    // Default group permissions — more restrictive than 1:1.
    // Notably excludes: file_system, email, calendar, personal_integrations.
    return [
      "web_search",
      "web_browse",
      "code_execute_sandboxed",
      "weather",
      "calculator",
    ];
  }
}
```

When building context for an LLM call, relevant memories are retrieved:

```typescript
class MemoryRetriever {
private factMemory: FactMemory;
private episodicMemory: EpisodicMemory;
/**
* Retrieve all relevant memories for a given query and user.
* Combines fact memory, episodic memory, and recency signals.
*/
async retrieve(params: {
query: string;
userId: string;
sessionId: string;
limit: number;
}): Promise<MemoryContext> {
// Parallel retrieval from all memory stores
const [facts, episodes] = await Promise.all([
this.factMemory.retrieve({
query: params.query,
user_id: params.userId,
limit: params.limit,
relevance_threshold: 0.6,
}),
this.episodicMemory.recall(params.query, params.userId),
]);
// Format for injection into system prompt
return {
facts: facts.map(f => `- ${f.fact} (${f.category})`).join("\n"),
episodes: episodes.map(e =>
`[${new Date(e.timestamp).toLocaleDateString()}] ${e.summary}`
).join("\n\n"),
formatted: this.formatForPrompt(facts, episodes),
};
}
private formatForPrompt(
facts: FactEntry[],
episodes: Episode[]
): string {
let prompt = "";
if (facts.length > 0) {
prompt += "## What I Know About You\n";
prompt += facts.map(f => `- ${f.fact}`).join("\n");
prompt += "\n\n";
}
if (episodes.length > 0) {
prompt += "## Recent Conversations\n";
prompt += episodes.map(e =>
`- ${new Date(e.timestamp).toLocaleDateString()}: ${e.summary}`
).join("\n");
}
return prompt;
}
}
```

```mermaid
graph LR
MSG[New Message] --> EXTRACT[Extract Facts]
EXTRACT --> DEDUP[Deduplication]
DEDUP --> STORE[Store in Memory]
STORE --> INDEX[Embed & Index]
QUERY[New Query] --> RETRIEVE[Retrieve Relevant]
RETRIEVE --> RANK[Rank by Relevance]
RANK --> INJECT[Inject into Context]
IDLE[Session Idle] --> EPISODE[Create Episode]
EPISODE --> ARCHIVE[Archive Session]
PRUNE[Prune Job] --> DECAY[Decay Low-Access]
DECAY --> DELETE[Delete Stale]
```
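The lifecycle's "Extract Facts" stage has no dedicated class in this chapter. A minimal sketch, assuming the same `llm` client shape as the classes above, might look like this:

```typescript
// Hypothetical fact extractor for the "Extract Facts" lifecycle stage.
type ExtractedFact = { fact: string; category: string; importance: number };

async function extractFacts(
  llm: { call(req: { system: string; messages: { role: string; content: string }[] }): Promise<{ text: string }> },
  message: { role: string; content: string },
): Promise<ExtractedFact[]> {
  const res = await llm.call({
    system: 'Extract durable user facts as a JSON array of {"fact","category","importance"} objects. Return [] if none.',
    messages: [{ role: "user", content: message.content }],
  });
  try {
    return JSON.parse(res.text); // the model may return non-JSON; fail closed
  } catch {
    return [];
  }
}
```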
Memories that aren't accessed gradually decay:

```typescript
class MemoryDecayManager {
/**
* Run periodic decay to prevent memory bloat.
* Low-importance, rarely-accessed memories fade.
*/
async runDecay() {
const allFacts = await this.db.query(
"SELECT * FROM facts WHERE last_accessed < $1",
[Date.now() - 30 * 24 * 60 * 60 * 1000] // 30 days
);
for (const fact of allFacts) {
const decayFactor = this.calculateDecay(fact);
if (decayFactor < 0.1) {
// Memory has decayed below threshold — archive
await this.archive(fact);
await this.db.delete("facts", fact.id);
} else {
// Reduce confidence
await this.db.update("facts", fact.id, {
confidence: fact.confidence * decayFactor,
});
}
}
}
private calculateDecay(fact: FactEntry): number {
const daysSinceAccess =
(Date.now() - fact.last_accessed) / (24 * 60 * 60 * 1000);
// Exponential decay modified by importance and access frequency
const baseDecay = Math.pow(0.5, daysSinceAccess / 90); // 90-day half-life (exp(-d / 90) would give a 90-day time constant, not a half-life)
const importanceBoost = fact.importance / 10; // 0-1
const accessBoost = Math.min(fact.access_count / 20, 1); // Cap at 20
return baseDecay * (1 + importanceBoost + accessBoost) / 3;
}
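  // archive() is referenced above but not defined in the original sketch.
  // Assumed behavior: move decayed facts to a cold table so they can be
  // restored later (this.db: an injected Database, as in FactMemory).
  private async archive(fact: FactEntry) {
    await this.db.insert("facts_archive", { ...fact, archived_at: Date.now() });
  }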
}
```

A key feature: memories persist across channels. A fact learned in WhatsApp is available in Telegram:

```typescript
class CrossChannelMemory {
/**
* User identity is unified across channels.
* The same user on WhatsApp and Telegram shares one memory store.
*/
async resolveUserId(
channel: string,
channelUserId: string
): Promise<string> {
// Check if this channel user is linked to a unified ID
const mapping = await this.db.get("user_mappings", {
channel,
channel_user_id: channelUserId,
});
if (mapping) return mapping.unified_id;
// Create new unified ID
const unifiedId = crypto.randomUUID();
await this.db.insert("user_mappings", {
channel,
channel_user_id: channelUserId,
unified_id: unifiedId,
});
return unifiedId;
}
/**
* Link two channel identities to the same user.
* "Hey, I'm also @john on Telegram" → links WhatsApp + Telegram.
*/
async linkIdentities(
unifiedId: string,
channel: string,
channelUserId: string
) {
await this.db.insert("user_mappings", {
channel,
channel_user_id: channelUserId,
unified_id: unifiedId,
});
}
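  // Hardening note (an assumption, not shown in the source): linking should
  // be verified before merging memory stores, e.g. with a one-time code
  // delivered on the channel being claimed, since a successful link exposes
  // the full memory history to the new identity.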
}
```

The high-level memory system described above is powered under the hood by a dedicated package: `packages/memory-host-sdk`. This SDK is the production-grade semantic memory engine that ships with OpenClaw; it handles vector embeddings, provider routing, and persistent storage, and provides a custom query language.
The built-in `FactMemory` and `EpisodicMemory` classes need a reliable, provider-agnostic way to generate embeddings, store vectors, and run similarity searches. `memory-host-sdk` provides a single engine that abstracts the supported embedding backends behind a uniform interface, so the rest of the codebase never has to care whether embeddings come from OpenAI, a local Ollama model, or AWS Bedrock.
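As a mental model, that uniform interface might look like the sketch below. The `EmbeddingBackend` and router shapes are illustrative assumptions; the SDK's actual types live in `packages/memory-host-sdk` and may differ.

```typescript
// Illustrative shape of a provider-agnostic embedding interface.
interface EmbeddingBackend {
  readonly model: string;
  embed(texts: string[]): Promise<number[][]>; // one vector per input text
}

// The router resolves the configured provider name to a concrete backend.
class EmbeddingRouter {
  constructor(private backends: Map<string, EmbeddingBackend>) {}

  embed(provider: string, texts: string[]): Promise<number[][]> {
    const backend = this.backends.get(provider);
    if (!backend) throw new Error(`Unknown embedding provider: ${provider}`);
    return backend.embed(texts);
  }
}
```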
```mermaid
graph TB
subgraph SDK["memory-host-sdk"]
ENGINE[engine.ts<br/>MemoryEngine]
EMB[engine-embeddings.ts<br/>EmbeddingRouter]
QMD[engine-qmd.ts<br/>QMD Query Parser]
STORE[engine-storage.ts<br/>SQLite + sqlite-vec]
FOUND[engine-foundation.ts<br/>Schema & Migrations]
end
subgraph Backends["Embedding Backends (packages/memory-host-sdk/src/host/)"]
OAI[embeddings-openai.ts]
VOY[embeddings-voyage.ts]
GEM[embeddings-gemini.ts]
OLL[embeddings-ollama.ts]
MIS[embeddings-mistral.ts]
BED[embeddings-bedrock.ts]
REM[embeddings-remote-provider.ts]
end
ENGINE --> EMB
ENGINE --> QMD
ENGINE --> STORE
STORE --> FOUND
EMB --> OAI
EMB --> VOY
EMB --> GEM
EMB --> OLL
EMB --> MIS
EMB --> BED
EMB --> REM
```
| File | Role |
|---|---|
| `src/engine.ts` | Top-level `MemoryEngine` — entry point for all memory operations |
| `src/engine-embeddings.ts` | Routes embedding requests to the configured backend |
| `src/engine-qmd.ts` | Parses and executes QMD queries against the vector store |
| `src/engine-storage.ts` | SQLite + sqlite-vec for vector persistence |
| `src/engine-foundation.ts` | Schema definitions and migration helpers |
| `src/host/embeddings-openai.ts` | OpenAI `text-embedding-3-*` backend |
| `src/host/embeddings-voyage.ts` | Voyage AI backend (best for code/technical content) |
| `src/host/embeddings-bedrock.ts` | AWS Bedrock Titan/Cohere backends |
| `src/host/embeddings-ollama.ts` | Local Ollama backend (fully offline) |
| `src/host/qmd-query-parser.ts` | QMD lexer + parser |
| `src/host/qmd-process.ts` | QMD execution engine |
| `src/host/session-files.ts` | Per-session memory file layout on disk |
QMD is a purpose-built query syntax for filtering memories by metadata and running vector similarity searches in one pass. It avoids the impedance mismatch of writing raw SQL against a vector table:
```typescript
// Example QMD queries
const results = await engine.query(
// Semantic search scoped to a category and recency window
"remember my preferences FOR:preference SINCE:30d LIMIT:10"
);
const codeMemories = await engine.query(
// Find memories tagged for a specific project with min similarity
"typescript patterns FOR:technical TAG:project-alpha SIM:0.75"
);
```

QMD expressions are parsed in `src/host/qmd-query-parser.ts`, compiled to a query plan in `src/host/qmd-scope.ts`, and executed against the SQLite vector store in `src/host/qmd-process.ts`.
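For intuition, the parsed form of a QMD query plausibly looks something like this (an illustrative shape; the real AST in `qmd-query-parser.ts` may differ):

```typescript
// Hypothetical parsed representation of a QMD query.
interface QmdQuery {
  text: string;            // free-text portion, embedded for similarity search
  category?: string;       // FOR:<category>
  tags: string[];          // TAG:<tag> (repeatable)
  sinceMs?: number;        // SINCE:30d resolved to an absolute timestamp
  minSimilarity?: number;  // SIM:0.75
  limit?: number;          // LIMIT:10
}
```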
```yaml
# In openclaw config (config.yaml)
memory:
  embedding_provider: voyage # openai | voyage | gemini | ollama | mistral | bedrock | remote
  embedding_model: voyage-3-large
  voyage_api_key: ${VOYAGE_API_KEY}

---
# For fully local/offline operation:
memory:
  embedding_provider: ollama
  embedding_model: nomic-embed-text
  ollama_base_url: http://localhost:11434
```

The `backend-config.ts` module validates provider config at startup and fails fast if required credentials are missing, preventing a silent fallback to lower-quality embeddings.
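A minimal sketch of that fail-fast check, assuming the config shape above (the real logic lives in `backend-config.ts` and may differ):

```typescript
// Illustrative startup validation: reject missing provider credentials early.
function validateMemoryConfig(cfg: Record<string, unknown> & { embedding_provider: string }) {
  const required: Record<string, string[]> = {
    openai: ["openai_api_key"],
    voyage: ["voyage_api_key"],
    ollama: ["ollama_base_url"],
    // ...one entry per supported provider
  };
  for (const key of required[cfg.embedding_provider] ?? []) {
    if (!cfg[key]) {
      throw new Error(
        `memory.${key} is required when embedding_provider is "${cfg.embedding_provider}"`,
      );
    }
  }
}
```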
Most users never touch `memory-host-sdk` directly — OpenClaw wires it up automatically. Reach for it directly when:
- Building a custom memory backend or alternative storage layer
- Writing integration tests that need to verify embedding quality
- Extending the QMD language with project-specific filter clauses
- Migrating to a new embedding provider (swap one backend file, everything else stays the same; see the sketch below)
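For the provider-swap case, a custom backend is plausibly just one class against the uniform interface sketched earlier (again, an assumption about the SDK's extension surface, not its documented API):

```typescript
// Hypothetical custom backend targeting an internal embedding service.
class MyHttpEmbeddings implements EmbeddingBackend {
  readonly model = "my-embed-v1";

  constructor(private baseUrl: string) {}

  async embed(texts: string[]): Promise<number[][]> {
    const res = await fetch(`${this.baseUrl}/embed`, {
      method: "POST",
      headers: { "content-type": "application/json" },
      body: JSON.stringify({ model: this.model, input: texts }),
    });
    if (!res.ok) throw new Error(`Embedding request failed: ${res.status}`);
    return (await res.json()).embeddings; // assumed response shape
  }
}
```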
| Concept | Key Takeaway |
|---|---|
| Context Window | Recent messages kept verbatim; older ones compressed into summaries |
| Fact Memory | Discrete facts about users with semantic embeddings and decay |
| Episodic Memory | Conversation summaries created when sessions end or idle |
| Session Modes | Main (1:1), Group (isolated), Queue (sequential) |
| Session Persistence | Sessions survive restarts via disk-backed store |
| Group Isolation | Sandboxed execution, restricted tools, no cross-user memory |
| Memory Retrieval | Parallel semantic search across facts and episodes |
| Memory Decay | Unused memories gradually fade; high-importance ones persist |
| Cross-Channel | Unified user identity — memories shared across all channels |
Next Steps: Chapter 6: Skills & Tools — Explore OpenClaw's extensible skill platform, built-in tools, browser control, and Live Canvas.
Built with insights from the OpenClaw repository and community documentation.
Most teams struggle here because the hard part is not writing more code, but deciding clear boundaries for `session`, `fact`, and `summary` so behavior stays predictable as complexity grows.
In practical terms, this chapter helps you avoid three common failures:
- coupling core logic too tightly to one implementation path
- missing the handoff boundaries between setup, execution, and validation
- shipping changes without a clear rollback or observability strategy
After working through this chapter, you should be able to reason about Chapter 5: Memory & Sessions as an operating subsystem inside OpenClaw: Deep Dive Tutorial, with explicit contracts for inputs, state transitions, and outputs.
Use the implementation notes around `messages`, `window`, and `query` as your checklist when adapting these patterns to your own repository.
Under the hood, Chapter 5: Memory & Sessions usually follows a repeatable control path:
- Context bootstrap: initialize runtime config and prerequisites for `session`.
- Input normalization: shape incoming data so `fact` receives stable contracts.
- Core execution: run the main logic branch and propagate intermediate state through `summary`.
- Policy and safety checks: enforce limits, auth scopes, and failure boundaries.
- Output composition: return canonical result payloads for downstream consumers.
- Operational telemetry: emit logs/metrics needed for debugging and performance tuning.
When debugging, walk this sequence in order and confirm each stage has explicit success/failure conditions.
Use the following upstream sources to verify implementation details while reading this chapter:
- OpenClaw (github.com)
  Why it matters: authoritative reference on `OpenClaw`.
Suggested trace strategy:
- search upstream code for `session` and `fact` to map concrete implementation paths
- compare doc claims against actual runtime/config code before reusing patterns in production