tiny-bitcask is a small Bitcask-style key/value store in Go: one active append-only data file per directory, an in-memory keydir (hash map) pointing at the latest record per key, and optional merge to drop stale records and reclaim space.
Add this module to your go.mod.
go get github.com/elliotchenzichang/tiny-bitcask@latest
From the repository root you can run the bundled example:
go run ./cmd/demoMinimal program: open a store on a directory, Set / Get bytes, then Close.
package main
import (
"fmt"
"log"
"os"
"path/filepath"
"tiny-bitcask"
)
func main() {
parent, err := os.MkdirTemp("", "tb")
if err != nil {
log.Fatal(err)
}
defer os.RemoveAll(parent)
dir := filepath.Join(parent, "data") // must not exist yet — see NewDB / recovery
o := *tiny_bitcask.DefaultOptions
o.Dir = dir
db, err := tiny_bitcask.NewDB(&o)
if err != nil {
log.Fatal(err)
}
defer db.Close()
if err := db.Set([]byte("demo-key"), []byte("demo-value")); err != nil {
log.Fatal(err)
}
v, err := db.Get([]byte("demo-key"))
if err != nil {
log.Fatal(err)
}
fmt.Println(string(v))
}The same source lives at cmd/demo/main.go. Could clone this repo and give it a try.
| Idea | Role in this project |
|---|---|
| Append-only active file | Writes go to ActiveFile; when size exceeds SegmentSize, the file is sealed and a new active file is opened (storage/datafiles.go). |
| Hint files | After rotation, each sealed fid.dat can have a compact fid.hint for faster recovery; invalid or missing hints fall back to scanning the data file (storage/hint.go, db.go recovery). |
| Keydir | index.KeyDir maps string key → DataPosition (file id, offset, key/value sizes, timestamp). |
| Read path | One hash lookup + one ReadAt by (fid, offset, length); optional CRC32 verification on read (Options.VerifyCRC, default true). |
| Merge / compaction | Scans immutable files and rewrites entries that are still the live version into the active file, then deletes merged files (DB.Merge). Live vs. stale is decided by comparing the keydir’s (fid, offset) to the start offset of each record while scanning. |
| Tombstone delete | Deletes append a record with DeleteFlag; the key is written in the record for recovery; the key is removed from the keydir (DB.Delete). |
High-level layout: the DB type ties together options, an in-memory keydir, segment files (active + sealed), optional hint sidecars for recovery, and an advisory lock. entity defines the on-disk record layout shared by writes, reads, hints, and merge.
flowchart TB
subgraph Client["Client"]
App["Application / cmd/demo"]
end
subgraph API["Package root"]
DB["DB · Get / Set / Delete / Merge / recovery"]
Opt["Options"]
Lk["Advisory lock · lock_unix / lock_other"]
end
subgraph Idx["index/"]
KD["KeyDir · key → DataPosition"]
end
subgraph St["storage/"]
DF["DataFiles"]
Act["ActiveFile · append-only segment"]
Old["OldFile · sealed segments"]
Hnt["Hint files · read/write"]
end
subgraph Ent["entity/"]
Rec["Entry · Meta · CRC · tombstone"]
end
subgraph Disk["Data directory on disk"]
Dat["*.dat"]
Hin["*.hint"]
Lf[".tiny-bitcask.lock"]
end
App --> DB
DB --> Opt
DB --> KD
DB --> DF
DB --> Lk
Lk --> Lf
DF --> Act
DF --> Old
DF --> Hnt
Act --> Dat
Old --> Dat
Hnt --> Hin
Rec -.->|encode / decode / verify| Act
Rec -.->|encode / decode / verify| Old
Rec -.->|compact index rows| Hnt
Typical paths (mental model). Write: append an Entry to the active .dat, then update KeyDir. Read: KeyDir lookup → ReadAt on the segment identified by fid. Open: scan segments (or load hints) to rebuild KeyDir. Merge: scan sealed segments for live rows only, rewrite into the active file, drop old .dat / .hint pairs.
- Open / create:
NewDB— empty directory creates a new store; existing directory recovers the keydir by scanning*.datfiles in order (or hints for sealed segments).Options:VerifyCRC(default on),ReadOnly(open existing store read-only),ExclusiveLock(Unix advisoryflockon.tiny-bitcask.lock; shared lock whenReadOnly). - Hint files: On segment rotation, a compact
fid.hintis written next to the sealedfid.dat(atomic write). Hint entries omit values; tombstone records are skipped. When merge removes an old segment, the matching.hintis removed with it. - Put / Get / Delete: basic APIs with a process-wide
RWMutex. - ListKeys / Fold:
ListKeysreturns keys sorted lexicographically;Foldwalks keys in that order and reads each value (read lock held for the scan). - Sync / Close:
DB.Syncfsyncs the active segment;DB.Closesyncs then closes segment files and releases the lock file handle. - Segment rotation: configurable
Options.SegmentSize(default 256 MiB). - On-disk record layout: fixed meta (CRC32, timestamps, sizes, flag) + key + value (
entity/entry.go). Tombstone records store the key withValueSize0. - Merge: rewrites live entries from old segments and removes merged files; tombstone records in old files are skipped during merge.
- CRC on read: Enabled by default; disable with
Options.VerifyCRC = falseif needed. - Recovery: Full segment scans apply tombstones in order (remove key from keydir) and populate
DataPosition.Timestampfrom record meta; hint recovery skips tombstone rows. - Tests:
db_test.gocovers CRUD, rotation, merge, delete+merge, hint recovery, merge after reopen, tombstone recovery, CRC failure, ListKeys/Fold, read-only open;storage/hint_test.goandentity/entry_test.gocover hint encoding and CRC/tombstones (requiresgithub.com/stretchr/testify).
Some items from bitcask-intro.pdf and typical production engines are still out of scope or partial:
- Merge + hint generation — Merge rewrites live data into the active file and deletes merged segments (and their hints); a new hint is written only when that segment is rotated (normal append-only behavior).
- Portability — Advisory locking is implemented on Unix (
flock). On other platforms the lock is a no-op; use a single process or external coordination. - API breadth — No key prefix / range iterator beyond sorted
ListKeys+Fold. No snapshot or MVCC reads. - Durability policy —
Syncis explicit; there is noSyncafter every write (callSyncwhen you need durability beyond process crash).
| Path | Purpose |
|---|---|
db.go |
NewDB, Get / Set / Delete, Merge, ListKeys, Fold, Sync, Close, recovery |
lock_unix.go, lock_other.go |
Optional advisory DB lock |
index/index.go |
Keydir (map + DataPosition) |
storage/datafiles.go |
Active/old files, rotation, read/write entries, CRC, Sync/Close |
storage/hint.go |
Hint file format, write on rotation, read/remove with segments |
entity/entry.go |
Binary encoding of records, tombstones, VerifyRecordCRC |
options.go |
Dir, SegmentSize, VerifyCRC, ReadOnly, ExclusiveLock |