MygramDB

In-memory full-text search engine with MySQL binlog replication. Sub-millisecond queries on million-row datasets.

Why MygramDB?

MySQL FULLTEXT scans B-tree pages on disk and struggles with common terms and concurrent load. MygramDB keeps a compressed n-gram index entirely in memory, syncing via GTID binlog replication.

Performance

Benchmarked on 1.1M Wikipedia articles (EN + JA), MygramDB v1.5.0 vs MySQL 8.4 FULLTEXT with ngram parser:

Query Type	MySQL	MygramDB	Speedup
Search (SORT id LIMIT 100)	507–2,566ms	0.08–0.42ms	1,200–6,700x
CJK search (Japanese bi-gram)	4–1,204ms	1–4ms	2–1,100x
COUNT	416–1,797ms	0.08ms	5,500–21,600x
Concurrent (4 connections)	8 QPS	11,766 QPS	1,400x

Sub-millisecond latency for most queries, no cache warmup needed
v1.5.0 verify_text eliminates n-gram false positives (exact match with MySQL results)
Reproducible: make bench-up && make bench-run (details)

Quick Start

Docker (Production Ready)

Prerequisites: Ensure MySQL has GTID mode enabled:

-- Check GTID mode (should be ON)
SHOW VARIABLES LIKE 'gtid_mode';

-- If OFF, enable GTID mode (MySQL 8.0+ / 9.x)
SET GLOBAL enforce_gtid_consistency = ON;
SET GLOBAL gtid_mode = OFF_PERMISSIVE;
SET GLOBAL gtid_mode = ON_PERMISSIVE;
SET GLOBAL gtid_mode = ON;

Start MygramDB:

docker run -d --name mygramdb \
  -p 11016:11016 \
  -e MYSQL_HOST=your-mysql-host \
  -e MYSQL_USER=repl_user \
  -e MYSQL_PASSWORD=your_password \
  -e MYSQL_DATABASE=mydb \
  -e TABLE_NAME=articles \
  -e TABLE_PRIMARY_KEY=id \
  -e TABLE_TEXT_COLUMN=content \
  -e TABLE_NGRAM_SIZE=2 \
  -e REPLICATION_SERVER_ID=12345 \
  -e NETWORK_ALLOW_CIDRS=0.0.0.0/0 \
  ghcr.io/libraz/mygram-db:latest

# Check logs
docker logs -f mygramdb

# Trigger initial data sync (required on first start)
docker exec mygramdb mygram-cli -p 11016 SYNC articles

# Try a search
docker exec mygramdb mygram-cli -p 11016 SEARCH articles "hello world"

Security Note: NETWORK_ALLOW_CIDRS=0.0.0.0/0 allows connections from any IP address. For production, restrict to specific IP ranges:

# Production example: Allow only from application servers
-e NETWORK_ALLOW_CIDRS=10.0.0.0/8,172.16.0.0/12

Docker Compose (with Test MySQL)

git clone https://github.com/libraz/mygram-db.git
cd mygram-db
docker-compose up -d

# Wait for MySQL to be ready (check with docker-compose logs -f)

# Trigger initial data sync
docker-compose exec mygramdb mygram-cli -p 11016 SYNC articles

# Try searching
docker-compose exec mygramdb mygram-cli -p 11016 SEARCH articles "hello"

Includes MySQL 8.4 with sample data for instant testing. Also tested with MySQL 9.6.

Basic Usage

# Search with pagination
SEARCH articles "hello world" SORT id LIMIT 100

# Sort by custom column
SEARCH articles "hello" SORT created_at DESC LIMIT 50

# LIMIT with offset (MySQL-style)
SEARCH articles "tech" LIMIT 10,100  # offset=10, count=100

# Count matches
COUNT articles "hello world"

# Multi-term AND search
SEARCH articles hello AND world

# With filters
SEARCH articles tech FILTER status=1 LIMIT 100

# Get by primary key
GET articles 12345

See Protocol Reference for all commands.

Features

Fast: Sub-millisecond search on million-row datasets
MySQL Replication: Real-time GTID-based binlog streaming
Runtime Variables: MySQL-style SET/SHOW VARIABLES for zero-downtime config changes
MySQL Failover: Switch MySQL servers at runtime with GTID position preservation
Multiple Tables: Index multiple tables in one instance
Dual Protocol: TCP (memcached-style) and HTTP/REST API
High Concurrency: Thread pool supporting 10,000+ connections
Unicode: ICU-based normalization for CJK/multilingual text
Compression: Hybrid Delta encoding + Roaring bitmaps
Easy Deploy: Single binary or Docker container

Architecture

graph LR
    MySQL[MySQL Primary] -->|binlog GTID| MygramDB1[MygramDB #1]
    MySQL -->|binlog GTID| MygramDB2[MygramDB #2]

    MygramDB1 -->|Search| App[Application]
    MygramDB2 -->|Search| App
    App -->|Write| MySQL

MygramDB acts as a specialized read replica for full-text search, while MySQL handles writes and normal queries.

When to Use MygramDB

✅ Good fit:

Search-heavy workloads (read >> write)
Millions of documents with full-text search
Need sub-100ms search latency
Simple deployment requirements
Japanese/CJK text with ngrams

❌ Not recommended:

Write-heavy workloads
Dataset doesn't fit in RAM (~1-2GB per million docs)
Need distributed search across nodes
Complex aggregations/analytics

Documentation

CHANGELOG - Version history and release notes
Docker Deployment Guide - Production Docker setup
Configuration Guide - All configuration options
Protocol Reference - Complete command reference
HTTP API Reference - REST API documentation
Performance Guide - Benchmarks and optimization
Replication Guide - MySQL replication setup
Operations Guide - Runtime variables and MySQL failover
Installation Guide - Build from source
Development Guide - Contributing guidelines
Client Library - C/C++ client library

Release Notes

Latest Release - Download binaries
Detailed Release Notes - Version-specific migration guides

Requirements

System:

RAM: ~1-2GB per million documents
OS: Linux or macOS

MySQL:

Version: 8.4+ / 9.x (tested with 8.4 and 9.6)
GTID mode enabled (gtid_mode=ON)
Binary log format: ROW (binlog_format=ROW)
Replication privileges: REPLICATION SLAVE, REPLICATION CLIENT

See Installation Guide for details.

License

MIT License

Contributing

We welcome contributions! Please see CONTRIBUTING.md for guidelines.

For development environment setup, see Development Guide.

Authors

libraz libraz@libraz.net

Related Projects

mysql-event-stream - Standalone MySQL CDC library extracted from MygramDB's replication layer
go-mygram-client - Go client library
node-mygramdb-client - Node.js client library (npm)
python-mygramdb-client - Python client library

Acknowledgments

Roaring Bitmaps for compressed bitmaps
ICU for Unicode support
spdlog for logging
yaml-cpp for configuration

Name		Name	Last commit message	Last commit date
Latest commit History 204 Commits
.github		.github
.vscode		.vscode
docs		docs
e2e		e2e
examples		examples
src		src
support		support
tests		tests
third_party		third_party
.actrc		.actrc
.clang-format		.clang-format
.clang-tidy		.clang-tidy
.clangd		.clangd
.dockerignore		.dockerignore
.env.example		.env.example
.gitignore		.gitignore
CHANGELOG.md		CHANGELOG.md
CMakeLists.txt		CMakeLists.txt
CONTRIBUTING.md		CONTRIBUTING.md
Dockerfile		Dockerfile
LICENSE		LICENSE
Makefile		Makefile
README.md		README.md
README_ja.md		README_ja.md
codecov.yml		codecov.yml
docker-compose.bench.yml		docker-compose.bench.yml
docker-compose.yml		docker-compose.yml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

MygramDB

Why MygramDB?

Performance

Quick Start

Docker (Production Ready)

Docker Compose (with Test MySQL)

Basic Usage

Features

Architecture

When to Use MygramDB

Documentation

Release Notes

Requirements

License

Contributing

Authors

Related Projects

Acknowledgments

About

Uh oh!

Releases 22

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

MygramDB

Why MygramDB?

Performance

Quick Start

Docker (Production Ready)

Docker Compose (with Test MySQL)

Basic Usage

Features

Architecture

When to Use MygramDB

Documentation

Release Notes

Requirements

License

Contributing

Authors

Related Projects

Acknowledgments

About

Topics

Resources

License

Contributing

Uh oh!

Stars

Watchers

Forks

Releases 22

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages