SpatialPathDB

Locality-Aware Partitioned Storage for Scalable Spatial Querying in Digital Pathology

Key Results (42.1M nuclei, 29 TCGA BLCA slides)

Metric	Value
Q1 Viewport (SPDB p50)	63 ms
Q1 Viewport (Mono p50)	159 ms
Improvement over Mono	2.5x
Cold-cache improvement	9.1x (53 ms vs 486 ms)
Partition pruning rate	89% (5.7 of ~59 sub-partitions scanned)
Hilbert vs Z-order	Hilbert 28% faster
kNN (k=50, SPDB)	15 ms p50
Mixed workload throughput	16.4 QPS
Max concurrent throughput (SO)	65 QPS @ 16 clients

Architecture

┌─────────────────────────────────────────┐
│             Application Layer           │
│   (Hilbert key range computation)       │
├─────────────────────────────────────────┤
│           PostgreSQL 17 / PostGIS 3.6   │
├─────────────────────────────────────────┤
│  Level 1: LIST(slide_id)  [29 slides]   │
│  Level 2: RANGE(hilbert_key) [~30/slide]│
│  = 857 leaf partitions                  │
├─────────────────────────────────────────┤
│  Per-partition hybrid indexes:          │
│  - GiST(geom)                           │
│  - B-tree(slide_id, hilbert_key)        │
│  - B-tree(slide_id, class_label)        │
│  - B-tree(tile_id)                      │
└─────────────────────────────────────────┘
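The application layer's Hilbert key range computation can be illustrated with a minimal sketch. It is not the code in spdb/hilbert.py (which the project describes as vectorized); it uses the standard iterative xy-to-d conversion. The function names, the slide extent parameters, and the assumption that each slide's key space is split into equal-width RANGE buckets are all hypothetical choices made for illustration.

```python
def hilbert_xy2d(order: int, x: int, y: int) -> int:
    """Map grid cell (x, y) on a 2^order x 2^order grid to its Hilbert index."""
    n = 1 << order
    d = 0
    s = n >> 1
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the sub-curve has the right orientation.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s >>= 1
    return d


def viewport_buckets(x0, y0, x1, y1, slide_w, slide_h, order=8, n_buckets=30):
    """Return the key-range buckets a viewport overlaps.

    Assumption: a slide's Hilbert key space [0, 4^order) is divided into
    n_buckets equal-width ranges; the real partition bounds may differ.
    """
    n = 1 << order

    def cell(v, extent):
        # Clamp a slide coordinate to a valid grid cell index.
        return min(n - 1, max(0, int(v / extent * n)))

    cx0, cx1 = cell(x0, slide_w), cell(x1, slide_w)
    cy0, cy1 = cell(y0, slide_h), cell(y1, slide_h)
    total_keys = n * n
    buckets = set()
    for cx in range(cx0, cx1 + 1):
        for cy in range(cy0, cy1 + 1):
            key = hilbert_xy2d(order, cx, cy)
            buckets.add(key * n_buckets // total_keys)
    return sorted(buckets)


if __name__ == "__main__":
    # A small viewport in the corner of a hypothetical 100,000 px slide.
    print(viewport_buckets(0, 0, 1000, 1000, 100_000, 100_000))
```

Only the returned buckets need to appear in the query's hilbert_key predicate, which is what lets the planner skip the remaining sub-partitions of that slide.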

Project Structure

spdb/
  config.py          - Configuration (DB, Hilbert order, benchmark params)
  hilbert.py         - Vectorized Hilbert curve encoder/decoder
  zorder.py          - Z-order (Morton) curve encoder
  schema.py          - Schema creation for all 5 DB configurations
  ingest.py          - Data pipeline: HuggingFace → transform → COPY
  pruning_model.py   - Formal pruning model
  adaptive.py        - Adaptive repartitioning prototype

benchmarks/
  framework.py       - Timing, statistics, EXPLAIN parsing, I/O decomposition
  q1_viewport.py     - Q1: Viewport range queries (with Hilbert pruning)
  q2_knn.py          - Q2: k-Nearest Neighbor queries
  q3_aggregation.py  - Q3: Tile-level aggregation
  q4_spatial_join.py - Q4: Spatial join with phenotype filter
  concurrent.py      - Concurrent throughput benchmark (asyncpg)
  extended.py        - Extended experiments (12 experiments total)

visualization/
  figures.py         - Publication-ready Matplotlib figures
  tables.py          - LaTeX table generation

paper/
  spdb.tex           - Full LaTeX paper

results/
  raw/               - JSON results + CSV raw latencies
  figures/           - PDF/PNG figures
  tables/            - LaTeX table fragments
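For readers unfamiliar with the Z-order curve listed under spdb/zorder.py, the following is a minimal scalar sketch of Morton encoding by bit interleaving. It is an illustration only, not the repository's encoder; the function names and the choice to place x bits in the even positions are assumptions.

```python
def morton_encode(x: int, y: int, order: int) -> int:
    """Interleave the low `order` bits of x (even positions) and y (odd positions)."""
    key = 0
    for i in range(order):
        key |= ((x >> i) & 1) << (2 * i)
        key |= ((y >> i) & 1) << (2 * i + 1)
    return key


def morton_decode(key: int, order: int) -> tuple[int, int]:
    """Inverse of morton_encode: split the interleaved bits back into (x, y)."""
    x = y = 0
    for i in range(order):
        x |= ((key >> (2 * i)) & 1) << i
        y |= ((key >> (2 * i + 1)) & 1) << i
    return x, y


if __name__ == "__main__":
    # Consecutive Z-order keys can jump between non-adjacent cells,
    # which a Hilbert curve never does.
    for key in range(4):
        print(key, morton_decode(key, 1))
```

The jump between keys 1 and 2 in the output (a diagonal move rather than a step to a neighboring cell) is the kind of locality break that motivates comparing Z-order against Hilbert ordering.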

Quick Start

# 1. Setup
python3 -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt

# 2. Create database
createdb spdb
psql -d spdb -c "CREATE EXTENSION IF NOT EXISTS postgis;"

# 3. Ingest data
python run_ingest.py

# 4. Run benchmarks
python run_benchmarks.py all

# 5. Generate figures
python run_benchmarks.py figures

Experiments

Q1-Q4: Core query benchmarks across all 5 configurations
Viewport sensitivity: 1%, 2%, 5%, 10%, 20% viewport fractions
Hilbert order sweep: p ∈ {6, 8, 10, 12}
Hilbert vs Z-order: Controlled comparison, identical partition structure
kNN k-sweep: k ∈ {10, 25, 50, 100, 200}
Concurrency: 1, 4, 16, 32, 64 concurrent clients
Workload mix: 70% Q1, 15% Q2, 10% Q3, 5% Q4
Storage overhead: Recursive partition size measurement
Density analysis: Per-slide object density distribution
Cold cache: Fresh connection per trial
Pruning analysis: EXPLAIN-based partition scan counting
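The EXPLAIN-based partition scan counting can be sketched as follows. This is a hedged illustration rather than the code in benchmarks/framework.py: it assumes the plan was obtained with EXPLAIN (FORMAT JSON), walks the plan tree, and collects the distinct "Relation Name" values of the scan nodes. The partition names and the sample plan are invented for the example.

```python
import json


def scanned_relations(plan_node: dict) -> set[str]:
    """Collect relation names from a plan node and all of its children."""
    names = set()
    rel = plan_node.get("Relation Name")
    if rel is not None:
        names.add(rel)
    for child in plan_node.get("Plans", []):
        names |= scanned_relations(child)
    return names


def count_scanned_partitions(explain_json: str) -> int:
    """Count distinct relations in the output of EXPLAIN (FORMAT JSON)."""
    doc = json.loads(explain_json)  # a JSON array holding one {"Plan": ...} object
    return len(scanned_relations(doc[0]["Plan"]))


# Hypothetical plan: an Append over three surviving leaf partitions.
SAMPLE_PLAN = json.dumps([{
    "Plan": {
        "Node Type": "Append",
        "Plans": [
            {"Node Type": "Index Scan", "Relation Name": "nuclei_s01_h03"},
            {"Node Type": "Index Scan", "Relation Name": "nuclei_s01_h04"},
            {"Node Type": "Bitmap Heap Scan", "Relation Name": "nuclei_s01_h05",
             "Plans": [{"Node Type": "Bitmap Index Scan"}]},
        ],
    }
}])

if __name__ == "__main__":
    scanned = count_scanned_partitions(SAMPLE_PLAN)
    total = 30  # assumed number of sub-partitions for the slide
    print(f"scanned {scanned} of {total}, pruning rate {1 - scanned / total:.0%}")
```

Partitions removed at plan time do not appear in the plan at all, so counting the relations that remain gives the number scanned. Partitions pruned only at execution time would need an extra check on the ANALYZE output, which this sketch omits.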

Citation

@article{yamani2026spatialpathdb,
  title={SpatialPathDB: Locality-Aware Partitioned Storage for
         Scalable Spatial Querying in Digital Pathology},
  author={Yamani, Sai Asish},
  year={2026}
}
