



A major algorithmic advancement to Google's ScaNN that introduces controlled redundancy to the vector index, leading to improved search efficiency. Enables even faster vector search while maintaining or improving accuracy.
SOAR (Spilling with Orthogonality-Amplified Residuals) is a major algorithmic advancement introduced by Google Research to enhance ScaNN's vector search capabilities. It introduces controlled redundancy to the vector index, significantly improving search efficiency.
Traditional partitioning assigns each vector to exactly one partition. SOAR allows vectors to "spill" into multiple partitions, creating redundancy that improves recall without proportional increases in search cost.
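The spilling idea can be illustrated with a small NumPy sketch. This is not ScaNN's implementation: the function name `assign_with_spilling` and the parameter `lam` are hypothetical, and the secondary-assignment penalty is a simplified reading of "orthogonality-amplified residuals", assuming the second partition is chosen so that its residual is not parallel to the primary residual.

```python
import numpy as np

def assign_with_spilling(vectors, centroids, lam=1.0):
    """Return (primary, secondary) partition ids for each vector.

    primary: nearest centroid (standard single assignment).
    secondary: centroid minimizing ||r'||^2 + lam * (<r, r'> / ||r||)^2,
    where r and r' are residuals to the primary and candidate centroids.
    This loss form is an assumption made for illustration.
    """
    # Residuals from every vector to every centroid: shape (n, k, d).
    diffs = vectors[:, None, :] - centroids[None, :, :]
    sq_dists = np.einsum("nkd,nkd->nk", diffs, diffs)
    primary = sq_dists.argmin(axis=1)

    rows = np.arange(len(vectors))
    r = diffs[rows, primary]                      # primary residuals, shape (n, d)
    r_norm = np.linalg.norm(r, axis=1, keepdims=True)
    r_unit = r / np.maximum(r_norm, 1e-12)        # guard against zero residuals

    # Component of each candidate residual parallel to the primary residual.
    parallel = np.einsum("nkd,nd->nk", diffs, r_unit)
    loss = sq_dists + lam * parallel**2
    loss[rows, primary] = np.inf                  # secondary must differ from primary
    secondary = loss.argmin(axis=1)
    return primary, secondary
```

With `lam=0` the secondary partition is simply the second-nearest centroid. Raising `lam` favors centroids whose residual is closer to orthogonal to the primary one, so the second copy of a vector is more likely to be found by queries for which the primary assignment performs poorly.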
Advantages:
Costs:
Available in Google's ScaNN library:
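import scann

# db: 2-D float32 NumPy array of database vectors, one row per vector (defined elsewhere).
searcher = (
    scann.scann_ops_pybind.builder(db, 10, "dot_product")  # return 10 neighbors per query
    .tree(
        num_leaves=2000,
        num_leaves_to_search=100,
        training_sample_size=250000,
    )
    .score_ah(2, anisotropic_quantization_threshold=0.2)
    .reorder(100)
    .build()
)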
SOAR features are integrated into the tree and scoring components.
vs. Standard IVF: Much better recall-speed tradeoff
vs. HNSW: Competitive performance with different characteristics
vs. Original ScaNN: 2-3x faster at same accuracy
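Comparisons like these are normally stated as recall at a fixed query cost. A minimal sketch of how recall@k could be measured against brute-force results is shown here; the helper names `exact_top_k` and `recall_at_k` are hypothetical and are not part of ScaNN.

```python
import numpy as np

def exact_top_k(db, queries, k):
    """Brute-force top-k by dot product; returns indices of shape (num_queries, k)."""
    scores = queries @ db.T
    return np.argsort(-scores, axis=1)[:, :k]

def recall_at_k(approx_ids, exact_ids):
    """Fraction of true top-k neighbors that the approximate search returned."""
    hits = sum(len(set(a) & set(e)) for a, e in zip(approx_ids, exact_ids))
    return hits / exact_ids.size
```

The approximate ids would come from whichever index is under test. Timing the same queries alongside this recall figure gives one point on the recall-speed curve.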
Published by Google Research, SOAR represents the state of the art in learned approximate search methods and has influenced subsequent research in efficient vector search.
Open source as part of the ScaNN library on GitHub, under the Apache 2.0 license.
Free and open-source as part of ScaNN.