A page-level approximate nearest neighbor search system optimized for SSD-based workloads.
PageANN extends Microsoft's DiskANN with page-level graph organization to reduce random I/O and improve SSD utilization during vector search.
- Page-level indexing: Multiple vectors are merged into disk-aligned pages
- Reduced I/O: Fewer random accesses during graph traversal
- SSD-optimized: Better utilization of SSD bandwidth and parallelism
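As a rough illustration of why merging vectors into disk-aligned pages reduces I/O, the sketch below estimates how many vectors fit in one page and how many pages a dataset would occupy. It is back-of-the-envelope arithmetic, not PageANN's actual on-disk layout: the 4096-byte page size, the `reserved_bytes` allowance for per-page metadata such as neighbor lists, and the function names are all assumptions made for this example.

```python
import math


def vectors_per_page(dim, bytes_per_component=4, page_bytes=4096, reserved_bytes=0):
    """Estimate how many raw vectors fit in one disk page.

    All parameters are illustrative assumptions, not PageANN settings:
    page_bytes is a typical SSD page size, and reserved_bytes stands in
    for whatever per-page metadata a real layout would store.
    """
    vector_bytes = dim * bytes_per_component
    usable_bytes = page_bytes - reserved_bytes
    if vector_bytes <= 0 or usable_bytes < 0:
        raise ValueError("dimension, component size, and usable space must be positive")
    return usable_bytes // vector_bytes


def pages_needed(num_vectors, per_page):
    """Number of pages required to hold num_vectors at per_page vectors each."""
    if per_page <= 0:
        raise ValueError("a single vector does not fit in one page")
    return math.ceil(num_vectors / per_page)


# 128-dimensional float32 vectors: 512 bytes each, so 8 fit in a 4 KiB page.
per_page = vectors_per_page(dim=128)
# One million such vectors would span 125,000 pages under these assumptions.
total_pages = pages_needed(1_000_000, per_page)
```

Under these assumptions, a single page read returns several candidate vectors at once, whereas a vector-level layout may need a separate random read for each.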
If you use PageANN in your research, please cite our paper:
Scalable Disk-Based Approximate Nearest Neighbor Search with Page-Aligned Graph
@article{kang2025pageann,
title={Scalable Disk-Based Approximate Nearest Neighbor Search with Page-Aligned Graph},
author={Kang, Dingyi and Jiang, Dongming and Yang, Hanshen and Liu, Hang and Li, Bingzhe},
journal={arXiv preprint arXiv:2509.25487},
year={2025},
url={https://www.arxiv.org/abs/2509.25487}
}

This project is built upon Microsoft Research's DiskANN:
- DiskANN repository: https://github.com/microsoft/DiskANN
- Original paper: Subramanya et al., "DiskANN: Fast Accurate Billion-point Nearest Neighbor Search on a Single Node", NeurIPS 2019
Original DiskANN: Copyright (c) Microsoft Corporation
PageANN Modifications: Copyright (c) 2025 Dingyi Kang
Dingyi Kang (email: dingyikangosu@gmail.com)
MIT License - see LICENSE file for details.
This project contains:
- DiskANN components: Copyright (c) Microsoft Corporation
- PageANN modifications: Copyright (c) 2025 Dingyi Kang
Linux (Ubuntu 20.04+):
sudo apt install make cmake g++ libaio-dev libgoogle-perftools-dev clang-format libboost-all-dev
sudo apt install libmkl-full-dev

Earlier Ubuntu versions: Install Intel MKL manually from the oneAPI MKL installer.
mkdir build && cd build
cmake -DCMAKE_BUILD_TYPE=Release ..
make -j

For detailed workflow and command-line instructions, see:
- PageANN Usage Guide: Complete workflow for building and searching PageANN indexes
- Paper Experiments Parameters: Exact parameters to reproduce results from our paper
- build_vamana_disk_index: Build Vamana vector-level disk index
- generate_page_graph: Convert Vamana index to page-level graph
- recommend_vamana_graph_degree: Recommend optimal graph degree parameters
- search_disk_index: Search the page-level index

- compute_groundtruth: Compute ground truth for recall evaluation
- generate_hash_buckets: Generate hash buckets for routing
- generate_reorder_pq: Regenerate PQ with different compression levels
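For readers unfamiliar with the recall evaluation that the ground truth supports, here is a minimal sketch of the standard recall@k metric. It is not taken from the PageANN tools: the function name and the in-memory list-of-lists inputs are assumptions for illustration, and the file formats the tools actually read and write are not covered here.

```python
def recall_at_k(results, groundtruth, k):
    """Average fraction of the true top-k neighbors found per query.

    results and groundtruth are hypothetical in-memory inputs: one list of
    neighbor IDs per query, ordered from nearest to farthest.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    if not results or len(results) != len(groundtruth):
        raise ValueError("need one non-empty result list per ground-truth list")

    hits = 0
    for found, truth in zip(results, groundtruth):
        # Count IDs that appear in both the returned and the true top-k.
        hits += len(set(found[:k]) & set(truth[:k]))
    return hits / (k * len(results))


# Two queries with k = 3: the first finds 2 of 3 true neighbors, the second 1 of 3.
example_results = [[1, 2, 3], [4, 5, 6]]
example_truth = [[1, 2, 9], [4, 7, 8]]
example_recall = recall_at_k(example_results, example_truth, k=3)
```

In the example, 3 of the 6 true neighbors are recovered across both queries, so the recall is 0.5.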
Contributions are welcome! Please feel free to submit issues or pull requests.
Special thanks to Microsoft Research for open-sourcing DiskANN, which forms the foundation of this work.