Memory in NVIDIA NeMo Agent Toolkit

The NeMo Agent Toolkit Memory subsystem is designed to store and retrieve a user's conversation history, preferences, and other "long-term memory." This is especially useful for building stateful LLM-based applications that recall user-specific data or interactions across multiple steps.

The memory module is designed to be extensible, allowing developers to create custom memory back-ends, which are called providers in NeMo Agent Toolkit terminology.

Included Memory Modules

The NeMo Agent Toolkit includes three memory module providers, all of which are available as plugins:

Mem0, which is provided by the nvidia-nat-mem0ai plugin.
Redis, which is provided by the nvidia-nat-redis plugin.
Zep, which is provided by the nvidia-nat-zep-cloud plugin (see the Zep NVIDIA NeMo documentation).

Automatic Memory Wrapper Agent

The NeMo Agent Toolkit provides an auto_memory_agent wrapper that adds automatic memory capture and retrieval to any agent without requiring the LLM to invoke memory tools explicitly.

Why Use Automatic Memory?

Traditional tool-based memory has several drawbacks:

LLMs may forget to call memory tools
Memory capture is inconsistent
Requires explicit memory tool configuration

The automatic memory wrapper agent provides:

Guaranteed capture: User messages and agent responses are automatically stored
Automatic retrieval: Relevant context is injected before each agent call
Memory backend agnostic: Works with Zep, Mem0, Redis, or any MemoryEditor
Universal compatibility: Wraps any agent type (ReAct, ReWOO, Tool Calling, etc.)

Quick Start

To use automatic memory, wrap any agent with the auto_memory_agent workflow type:

memory:
  zep_memory:
    _type: nat.plugins.zep_cloud/zep_memory

functions:
  my_react_agent:
    _type: react_agent
    llm_name: nim_llm
    tool_names: [calculator]

workflow:
  _type: auto_memory_agent
  inner_agent_name: my_react_agent
  memory_name: zep_memory
  llm_name: nim_llm
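
Because the wrapper refers to the inner agent only by name, switching to a different agent type should require changes only in the functions section and in inner_agent_name. The following is a minimal sketch, assuming that tool_calling_agent is the registered type name for the Tool Calling agent and that a calculator tool and a nim_llm LLM are defined elsewhere in the configuration. The name my_tool_calling_agent is hypothetical.

```yaml
memory:
  zep_memory:
    _type: nat.plugins.zep_cloud/zep_memory

functions:
  my_tool_calling_agent:            # hypothetical name chosen for this sketch
    _type: tool_calling_agent       # assumed type name for the Tool Calling agent
    llm_name: nim_llm
    tool_names: [calculator]

workflow:
  _type: auto_memory_agent
  inner_agent_name: my_tool_calling_agent   # points at the agent defined above
  memory_name: zep_memory
  llm_name: nim_llm
```

The memory and workflow sections are otherwise the same as in the ReAct example above.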

Configuration Options

The automatic memory wrapper agent supports several configuration parameters:

Required Parameters:

inner_agent_name: Name of the agent to wrap with automatic memory
memory_name: Name of the memory backend (from memory: section)
llm_name: LLM to use (required by AgentBaseConfig)

Optional Feature Flags (all default to true):

save_user_messages_to_memory: Automatically save user messages before agent processing
retrieve_memory_for_every_response: Automatically retrieve and inject memory context
save_ai_messages_to_memory: Automatically save agent responses after generation

Memory Backend Parameters:

search_params: Passed to memory_editor.search() (e.g., mode, top_k)
add_params: Passed to memory_editor.add_items() (e.g., ignore_roles)
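
The parameters above can be combined in a single workflow section. The following is a sketch rather than a recommended configuration: the parameter names come from the lists above, but the specific values (disabling save_ai_messages_to_memory, top_k of 5, and ignoring the assistant role) are illustrative assumptions, and the keys accepted under search_params and add_params depend on the memory backend in use.

```yaml
workflow:
  _type: auto_memory_agent
  # Required parameters
  inner_agent_name: my_react_agent
  memory_name: zep_memory
  llm_name: nim_llm

  # Optional feature flags (all default to true)
  save_user_messages_to_memory: true
  retrieve_memory_for_every_response: true
  save_ai_messages_to_memory: false   # illustrative: skip storing agent responses

  # Forwarded to memory_editor.search(); values are assumptions
  search_params:
    top_k: 5

  # Forwarded to memory_editor.add_items(); values are assumptions
  add_params:
    ignore_roles: [assistant]
```

This fragment assumes the my_react_agent function and zep_memory backend from the Quick Start configuration are defined in the same file.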

Multi-Tenant Memory Isolation

User ID is automatically extracted at runtime for memory isolation via:

user_manager.get_id() - For production with custom auth middleware (recommended)
X-User-ID HTTP header - For testing without middleware
"default_user" - Fallback for local development

For detailed configuration and usage examples, refer to the examples/agents/auto_memory_wrapper/README.md guide.

Examples

The following examples in the repository demonstrate how to use the memory module in the NeMo Agent Toolkit:

examples/agents/auto_memory_wrapper - Automatic memory wrapper agent for any agent
examples/memory/redis - Basic long-term memory using Redis
examples/frameworks/semantic_kernel_demo - Multi-agent system with long-term memory
examples/RAG/simple_rag - RAG system with Mem0 memory

Additional Resources

Information on how to write a new memory module provider can be found in the Adding a Memory Provider document.
