🤖 AI Summary
To address the memory constraints, long I/O traversal paths, storage-granularity mismatches, and heavy indexing overhead of large-scale vector search, particularly SSD-based approximate nearest neighbor search (ANNS), this paper proposes PageANN, the first SSD-oriented, page-aligned graph indexing framework. Its core innovation is a page-node graph structure that strictly aligns logical graph nodes with physical SSD pages, coupled with clustering-aware storage, topology compression, representative-vector merging, and cooperative memory management to achieve an I/O-efficient data layout and lightweight indexing. Evaluated across multiple datasets and memory budgets, PageANN achieves 1.85×–10.83× higher throughput and 51.7%–91.9% lower latency than state-of-the-art methods while maintaining high recall, significantly improving the scalability and efficiency of disk-native vector retrieval.
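To make the page alignment concrete, here is a minimal C++ sketch of what an on-disk page node could look like. The 4 KiB page size, 128-dimensional vectors, and all field names are illustrative assumptions rather than the paper's actual layout; the point is that a representative vector, the packed member vectors, and a compressed neighbor list together occupy exactly one SSD page.

```cpp
#include <cstddef>
#include <cstdint>

// Assumed constants, not taken from the paper.
constexpr std::size_t kPageSize = 4096;  // one SSD physical page (4 KiB)
constexpr std::size_t kDim      = 128;   // vector dimensionality

// Hypothetical on-disk layout of a single page node.
struct PageNode {
    // One representative vector summarizes the cluster stored in this page,
    // so a query can score the whole page from a single read.
    float         representative[kDim];
    std::uint32_t num_vectors;    // member vectors packed into `payload`
    std::uint32_t num_neighbors;  // out-edges (page-node IDs) in `payload`
    // Remaining bytes hold the (possibly quantized) member vectors followed
    // by a compressed neighbor list; sized so the node fills the page exactly.
    std::uint8_t  payload[kPageSize - sizeof(float) * kDim
                          - 2 * sizeof(std::uint32_t)];
};

static_assert(sizeof(PageNode) == kPageSize,
              "a page node must align exactly with one SSD page");
```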
📝 Abstract
Approximate Nearest Neighbor Search (ANNS), the core of vector databases (VectorDBs), is widely used in modern AI and ML systems, powering applications from information retrieval to bioinformatics. While graph-based ANNS methods achieve high query efficiency, their scalability is constrained by available host memory. Recent disk-based ANNS approaches mitigate memory usage by offloading data to Solid-State Drives (SSDs). However, they still suffer from long I/O traversal paths, misalignment with storage I/O granularity, and high in-memory indexing overhead, leading to significant I/O latency that ultimately limits scalability for large-scale vector search.
In this paper, we propose PageANN, a disk-based ANNS framework designed for high performance and scalability. PageANN introduces a page-node graph structure that aligns logical graph nodes with physical SSD pages, thereby shortening I/O traversal paths and reducing I/O operations. Specifically, similar vectors are clustered into page nodes, and a co-designed disk data layout leverages this structure with a merging technique to store only representative vectors and topology information, avoiding unnecessary reads. To further improve efficiency, we design a memory management strategy that combines lightweight indexing with coordinated memory-disk data allocation, maximizing host memory utilization while minimizing query latency and storage overhead. Experimental results show that PageANN significantly outperforms state-of-the-art (SOTA) disk-based ANNS methods, achieving 1.85×–10.83× higher throughput and 51.7%–91.9% lower latency across different datasets and memory budgets, while maintaining comparably high recall.
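As a hedged illustration of why this layout shortens I/O paths, the C++ sketch below walks a greedy search over page nodes: because each page carries both vectors and topology, every hop costs exactly one page-granular read. `PageView`, `greedy_search`, and the `read_page` callable are hypothetical names for this sketch, not PageANN's actual API.

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_set>
#include <utility>
#include <vector>

// What one page read yields in this sketch: the representative vector
// (scored without further I/O) and the out-edges to other page nodes.
struct PageView {
    std::vector<float>         representative;
    std::vector<std::uint32_t> neighbors;
};

// Squared L2 distance; assumes equal-length vectors.
inline float l2(const std::vector<float>& a, const std::vector<float>& b) {
    float s = 0.0f;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const float d = a[i] - b[i];
        s += d * d;
    }
    return s;
}

// Greedy best-first descent over page nodes. `read_page` is an assumed
// callable (std::uint32_t -> PageView) that performs exactly one SSD page
// read; vectors and topology arrive together, so each hop costs one I/O.
template <typename ReadPage>
std::uint32_t greedy_search(const std::vector<float>& query,
                            std::uint32_t entry_page, ReadPage&& read_page) {
    std::uint32_t best_id = entry_page;
    PageView best_view = read_page(best_id);               // one page read
    float best_dist = l2(query, best_view.representative);
    std::unordered_set<std::uint32_t> visited{best_id};

    bool improved = true;
    while (improved) {
        improved = false;
        // Copy the frontier: best_view may be replaced inside the loop.
        const std::vector<std::uint32_t> frontier = best_view.neighbors;
        for (std::uint32_t nb : frontier) {
            if (!visited.insert(nb).second) continue;      // already visited
            PageView v = read_page(nb);                    // one page read
            const float d = l2(query, v.representative);
            if (d < best_dist) {
                best_dist = d;
                best_id   = nb;
                best_view = std::move(v);
                improved  = true;
            }
        }
    }
    // best_id is the closest page node; its packed member vectors would
    // then be reranked (e.g., at full precision) to produce final results.
    return best_id;
}
```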