



SPANN is a highly efficient billion-scale approximate nearest neighbor (ANN) search system built on a cluster-based inverted index, with balanced partitioning of vectors into posting lists. Key features: disk-based storage, high recall, and low latency on commodity hardware. Use cases: web-scale recommendation and image retrieval. It is described as improving on DiskANN in build time and as being competitive on CPU with FAISS on GPU.
SPANN (Highly-efficient Billion-scale Approximate Nearest Neighbor Search) is a memory-disk hybrid vector indexing and search system developed by Microsoft Research that follows the inverted index methodology.
SPANN stores only the centroid points of posting lists in memory while putting the large posting lists on disk. This hybrid approach enables efficient billion-scale vector search with reduced memory requirements.
In other words, SPANN is an on-disk, cluster-based index: the clusters (posting lists) are stored on disk, and a small in-memory graph over the centroids is used to locate the clusters nearest to a query. Only those clusters are then read from disk and scanned, which keeps memory requirements low even at billion scale.
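The memory-disk split described above can be illustrated with a small sketch. This is not SPANN's implementation: the function names `build_index` and `search`, the `nprobe` parameter, and the binary record layout are assumptions made for illustration, centroids are supplied by the caller rather than learned, and a linear scan over centroids stands in for the in-memory graph. It only shows the general idea of keeping centroids plus a small directory in memory while posting lists stay in a file.

```python
import struct


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def build_index(vectors, centroids, path):
    """Write one posting list per centroid to `path`.

    Returns the in-memory directory: one (byte_offset, entry_count) pair
    per centroid. The vectors themselves are not kept in memory.
    """
    dim = len(centroids[0])
    # Hypothetical record layout: int32 vector id followed by `dim` float32 values.
    record = struct.Struct(f"<i{dim}f")

    # Assign every vector to its nearest centroid.
    postings = [[] for _ in centroids]
    for vid, vec in enumerate(vectors):
        nearest = min(range(len(centroids)),
                      key=lambda c: sq_dist(vec, centroids[c]))
        postings[nearest].append((vid, vec))

    # Store each posting list contiguously so it can be fetched in one read.
    directory = []
    with open(path, "wb") as f:
        for plist in postings:
            directory.append((f.tell(), len(plist)))
            for vid, vec in plist:
                f.write(record.pack(vid, *vec))
    return directory


def search(query, centroids, directory, path, k=3, nprobe=2):
    """Return the ids of the k nearest vectors found in the nprobe nearest clusters."""
    record = struct.Struct(f"<i{len(query)}f")

    # In-memory step: choose the closest centroids.
    # (Linear scan here; the text describes a small in-memory graph for this.)
    probe = sorted(range(len(centroids)),
                   key=lambda c: sq_dist(query, centroids[c]))[:nprobe]

    # Disk step: read and scan only the selected posting lists.
    candidates = []
    with open(path, "rb") as f:
        for c in probe:
            offset, count = directory[c]
            if count == 0:
                continue
            f.seek(offset)
            buf = f.read(count * record.size)
            for vid, *vec in record.iter_unpack(buf):
                candidates.append((sq_dist(query, vec), vid))

    return [vid for _, vid in sorted(candidates)[:k]]
```

In this sketch, calling `build_index(vectors, centroids, "postings.bin")` once and then `search(query, centroids, directory, "postings.bin")` per query touches at most `nprobe` posting lists on disk. Raising `nprobe` trades more disk reads for a better chance of finding the true nearest neighbors.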
Published by Microsoft Research, SPANN is a notable advance in disk-based vector indexing and is commonly used as a reference point when evaluating hybrid vector search systems.