Embenx — Agentic Memory Layer for Python AI Agents 🚀

The Agentic Memory Layer & Universal Retrieval Toolkit.
Synthetic data generation, 20+ vector backends, hybrid search, and MCP-native memory for AI agents.

📖 Read the Docs · Explore the Visual UI · Report Bug · Request Feature

What is Embenx?

Embenx is a Python-native retrieval library that sits between raw vector indices and full-blown vector databases. It provides a high-level Collection API for managing embeddings and metadata, supporting advanced features like filtering, reranking, and quantization across 20+ backends.

🌟 New in v1.5.1: OpenSearch Integration

Embenx now natively supports OpenSearch as a vector backend. Scale your agentic memory to production clusters with native k-NN vector search and enterprise-grade durability.

Quickstart

Get up and running in 60 seconds.

Step 1 — Install

pip install embenx

Step 2 — Create a collection and add embeddings

import numpy as np
from embenx import Collection

# 768-dim FAISS-HNSW index (in-memory, no extra config needed)
col = Collection(dimension=768, indexer_type="faiss-hnsw")

vectors = np.random.rand(10, 768).astype("float32")
metadata = [{"id": i, "text": f"Document {i}"} for i in range(10)]
col.add(vectors, metadata)

Step 3 — Search

query = np.random.rand(768).astype("float32")
results = col.search(query, top_k=3)

for meta, dist in results:
    print(f"{meta['text']}  (distance: {dist:.4f})")
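To make the (metadata, distance) pairs returned above concrete, here is a minimal brute-force sketch in plain NumPy. It is not Embenx code: `exact_top_k` is a hypothetical helper, and it assumes L2 (Euclidean) distance, which may differ from the metric a given Embenx indexer uses.

```python
import numpy as np

def exact_top_k(vectors, metadata, query, top_k=3):
    """Return the top_k (metadata, distance) pairs by exact L2 distance."""
    # Euclidean distance from the query to every stored vector.
    dists = np.linalg.norm(vectors - query, axis=1)
    # Indices of the top_k smallest distances, nearest first.
    order = np.argsort(dists)[:top_k]
    return [(metadata[i], float(dists[i])) for i in order]

rng = np.random.default_rng(0)
vectors = rng.random((10, 768), dtype=np.float32)
metadata = [{"id": i, "text": f"Document {i}"} for i in range(10)]
query = vectors[4]  # querying with a stored vector should return it first

for meta, dist in exact_top_k(vectors, metadata, query):
    print(f"{meta['text']}  (distance: {dist:.4f})")
```

An approximate index such as HNSW trades a little of this exactness for much faster lookups on large collections.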

Library Usage

🚀 Production Deployment (OpenSearch, Qdrant, Milvus)

Embenx makes it easy to transition from local development to production-grade vector clusters.

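import numpy as np
from embenx import Collection

# Initialize with OpenSearch (assumes http://localhost:9200;
# set the OPENSEARCH_URL env var to override)
col = Collection(dimension=128, indexer_type="opensearch")

# Example data: 10 random 128-dim vectors with matching metadata
vectors = np.random.rand(10, 128).astype("float32")
metadata = [{"id": i, "text": f"Document {i}"} for i in range(10)]

# Add data directly to OpenSearch
col.add(vectors, metadata)

# Search with native k-NN
query_vector = np.random.rand(128).astype("float32")
results = col.search(query_vector, top_k=5)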

🧠 Agentic Memory & Hybrid Search

Combine semantic search with keyword retrieval and self-healing feedback loops.

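import numpy as np
from embenx import Collection

# Initialize with hybrid search (FAISS + BM25)
col = Collection(dimension=768, indexer_type="faiss-hnsw", sparse_indexer_type="bm25")

# Assumes documents have already been added with col.add(...)
# Placeholder query embedding; in practice, embed the query text with your model
query_vec = np.random.rand(768).astype("float32")

# Hybrid search using RRF (Reciprocal Rank Fusion)
results = col.hybrid_search(
    query_vector=query_vec,
    query_text="What is the capital of France?",
    top_k=5
)

# Self-healing feedback
col.feedback(doc_id="doc_123", label="good")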
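The hybrid search above merges a dense ranking and a sparse ranking with Reciprocal Rank Fusion. As a rough sketch of the technique itself, independent of Embenx internals: each document scores the sum of 1 / (k + rank) over the lists it appears in. The helper `rrf_fuse` and the constant k=60 (a commonly used default) are assumptions for illustration; the value Embenx uses is not stated here.

```python
from collections import defaultdict

def rrf_fuse(rankings, k=60, top_k=5):
    """Fuse ranked lists of doc ids with Reciprocal Rank Fusion."""
    scores = defaultdict(float)
    for ranking in rankings:
        # Ranks are 1-based: the best hit in a list contributes 1 / (k + 1).
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first.
    fused = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return fused[:top_k]

dense = ["doc_3", "doc_1", "doc_7"]   # hypothetical vector-search ranking
sparse = ["doc_1", "doc_9", "doc_3"]  # hypothetical BM25 ranking
fused = rrf_fuse([dense, sparse])

for doc_id, score in fused:
    print(f"{doc_id}: {score:.5f}")
```

Because RRF only looks at ranks, it needs no score normalization between the dense and sparse retrievers, which is the usual reason to prefer it over a weighted sum of raw scores.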

🧪 Synthetic Data Generation

Generate high-quality query-document pairs to train or evaluate your retrieval pipelines.

# Assumes `col` is an existing Collection that already contains documents
results = col.generate_synthetic_queries(
    n_queries_per_doc=2,
    num_docs=100,
    model="gpt-4o-mini",  # Or "ollama/llama3"
    output_path="eval_data.jsonl"
)
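One common way to use such query-document pairs for evaluation is recall@k: for each synthetic query, check whether the document it was generated from appears in the top k retrieved results. The sketch below assumes a hypothetical in-memory layout (lists of retrieved ids plus one source id per query); the actual structure of `eval_data.jsonl` is not described here and may differ.

```python
def recall_at_k(retrieved_ids, relevant_ids, k):
    """Fraction of queries whose source document appears in the top k results."""
    hits = sum(
        1
        for retrieved, relevant in zip(retrieved_ids, relevant_ids)
        if relevant in retrieved[:k]
    )
    return hits / len(relevant_ids)

# Hypothetical data: one ranked result list and one source doc id per query.
retrieved = [["d1", "d2", "d3"], ["d5", "d4", "d6"], ["d9", "d8", "d7"]]
relevant = ["d1", "d4", "d0"]

print(f"recall@1 = {recall_at_k(retrieved, relevant, 1):.2f}")
print(f"recall@3 = {recall_at_k(retrieved, relevant, 3):.2f}")
```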

Agentic Memory (MCP)

Embenx ships with a built-in Model Context Protocol (MCP) server. This allows AI agents (like Claude Desktop) to use Embenx collections as their own long-term memory.

1. Start the server

embenx mcp-start

Visual Explorer

Embenx provides a built-in web UI to visualize your vector collections, including an interactive HNSW Graph Visualizer and a RAG Playground.

embenx explorer

Features

20+ Vector Backends — Native support for OpenSearch, Qdrant, Milvus, FAISS, PGVector, and more.
Synthetic Data Generation — Create high-quality query-document pairs using LLMs for training and evaluation.
Multimodal Support — Native support for image embeddings (CLIP).
RAG Playground — Test retrieval quality with an integrated LLM chat loop.
HNSW Graph Visualizer — Interactive 3D visualization of navigation layers.
Agentic Memory (MCP) — Native Model Context Protocol support for AI agents.
Self-Healing Retrieval — Integrated feedback loops to automatically improve ranking accuracy.
Temporal Memory (Echo) — Recency-biased retrieval and time-window filtering.
Spatial Memory (ESWM) — Neuroscience-inspired spatial cognitive maps for navigation.
Hybrid Search — Combine dense vectors with sparse BM25 retrieval using RRF.
Portable Formats — Native support for Parquet, NumPy (.npy/.npz), and FAISS (.index).
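To illustrate what recency-biased retrieval can mean in practice, the sketch below re-ranks results by multiplying each similarity score by an exponential decay on the document's age. This is an assumption-laden illustration, not the Echo implementation: `recency_weighted`, the half-life parameter, and the tuple layout are all hypothetical.

```python
def recency_weighted(results, now, half_life_s=3600.0):
    """Re-rank (doc_id, similarity, timestamp) tuples by similarity * decay."""
    rescored = []
    for doc_id, similarity, timestamp in results:
        age = max(0.0, now - timestamp)
        # The weight halves every half_life_s seconds of age.
        decay = 0.5 ** (age / half_life_s)
        rescored.append((doc_id, similarity * decay))
    return sorted(rescored, key=lambda item: item[1], reverse=True)

now = 10_000.0
results = [
    ("old", 0.90, now - 7200.0),  # more similar, but two half-lives old
    ("new", 0.60, now),           # less similar, but fresh
]
ranked = recency_weighted(results, now)

for doc_id, score in ranked:
    print(f"{doc_id}: {score:.3f}")
```

With a one-hour half-life, the fresher document wins despite its lower raw similarity; a longer half-life shifts the balance back toward pure semantic relevance.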

Supported Indexers

Indexer Key	Family / Algorithm	Best For
`opensearch`	OpenSearch	Native k-NN vector search (Production)
`faiss-hnsw`	FAISS HNSW	High-recall in-memory search
`qdrant`	Qdrant	Filtered vector search at scale
`milvus`	Milvus Cluster	Distributed production workloads
`pgvector`	PostgreSQL pgvector	Embeddings next to relational data
`elasticsearch`	Elasticsearch	Full-text + vector search combined
`scann`	ScaNN Tree-AH	State-of-the-art speed/recall (Linux)
`usearch`	USearch HNSW	High-performance C++, low latency
`hnswlib`	HNSWLib	Pure HNSW, easy to tune
`weaviate`	Weaviate	Multi-tenant, schema-driven search
`duckdb`	DuckDB	Analytical + vector hybrid queries
`lance`	LanceDB Columnar	Large disk-based datasets
`bm25`	BM25 (sparse)	Keyword / sparse retrieval baseline
`simple`	NumPy Exact	Exact search, zero dependencies

...and 8 more variants including quantized (PQ/SQ8) and half-precision (f16/i8) indices.
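For readers unfamiliar with SQ8, the idea is to store each vector component as one byte instead of a 4-byte float. The following is a minimal NumPy sketch of per-dimension 8-bit scalar quantization under simple min/max scaling; it illustrates the general technique and is not how Embenx or FAISS implement it. The helper names are hypothetical.

```python
import numpy as np

def sq8_encode(vectors):
    """Quantize float32 vectors to uint8 codes with per-dimension min/max scaling."""
    lo = vectors.min(axis=0)
    hi = vectors.max(axis=0)
    # Step size per dimension; fall back to 1.0 where a dimension is constant.
    scale = np.where(hi > lo, (hi - lo) / 255.0, 1.0).astype(np.float32)
    codes = np.clip(np.round((vectors - lo) / scale), 0, 255).astype(np.uint8)
    return codes, lo, scale

def sq8_decode(codes, lo, scale):
    """Reconstruct approximate float32 vectors from uint8 codes."""
    return codes.astype(np.float32) * scale + lo

rng = np.random.default_rng(0)
vectors = rng.random((100, 16), dtype=np.float32)

codes, lo, scale = sq8_encode(vectors)
approx = sq8_decode(codes, lo, scale)
max_err = float(np.abs(vectors - approx).max())

print(f"bytes: {vectors.nbytes} -> {codes.nbytes}, max abs error: {max_err:.5f}")
```

The codes take a quarter of the memory of the float32 originals, at the cost of a reconstruction error bounded by roughly half a quantization step per dimension.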

Installation

pip install embenx

License

Distributed under the MIT License. See LICENSE for more information.

Built with ❤️ for the AI engineering community by adityak74
