SOAR

SOAR is a set of improved algorithms on top of ScaNN that accelerate vector search by introducing controlled redundancy and multi-cluster assignment, enabling faster approximate nearest neighbor retrieval with smaller indexes in large‑scale vector databases and search systems.

Information

Website: research.google

Published: Dec 25, 2025

Tags

#ann #vector-search #optimization

Similar Products

Optimized Product Quantization (OPQ)

Optimized Product Quantization (OPQ) enhances Product Quantization by optimizing space decomposition and codebooks, leading to lower quantization distortion and higher accuracy in vector search. OPQ is widely used in advanced vector databases for improving recall and search quality.

BANG

BANG is a billion-scale approximate nearest neighbor search system optimized for single GPU execution, enabling high-performance vector search in vector database environments at massive scale.

Efficient and Robust Approximate Nearest Neighbor Search Using Hierarchical Navigable Small World Graphs

This paper introduces the HNSW algorithm, which is widely adopted in vector databases and search engines for its efficient and robust performance on high-dimensional data. HNSW is foundational in powering modern vector search systems.

Li, Wen, et al. "Approximate nearest neighbor search on high dimensional data—experiments, analyses, and improvement."

An influential paper analyzing and improving approximate nearest neighbor search methods for high-dimensional data, highly relevant for developing and understanding vector databases.

NGT

NGT (Neighborhood Graph and Tree) is an open-source vector search engine designed for fast and scalable approximate nearest neighbor search.

GleanVec: Accelerating vector search with minimalist nonlinear dimensionality reduction

Paper by Tepper et al. proposing GleanVec, a method to accelerate vector search using minimalist nonlinear dimensionality reduction. Improves efficiency for high-dimensional vector queries.

Description

SOAR (Spilling with Orthogonality-Amplified Residuals) is a set of improved algorithms built on top of the ScaNN vector search library. It focuses on accelerating approximate nearest neighbor (ANN) search over large-scale embedding datasets by introducing controlled redundancy and multi-cluster assignment. This design enables faster vector similarity search while keeping index sizes relatively small and preserving key index quality metrics.

Key Details

Domain: Approximate nearest neighbor (ANN) / vector similarity search

Primary use cases: Large-scale embedding search for ML applications (e.g., image, web, media retrieval; retrieval-augmented generation systems)

Publication: “SOAR: Improved Indexing for Approximate Nearest Neighbor Search” (NeurIPS 2023)

Underlying system: Extends and enhances the ScaNN open-source vector search library

Features

Improved indexing for ANN search

Introduces a new indexing approach (SOAR) that refines how ScaNN structures and searches large embedding datasets.

Controlled redundancy in the vector index

Adds mathematically designed redundancy to the index to improve search efficiency.
Redundancy is engineered to have minimal impact on overall index size and other index metrics.

Multi-cluster assignment

Assigns each vector to more than one cluster, increasing the likelihood that true nearest neighbors are found when only a few clusters are searched.
Supports faster approximate nearest neighbor retrieval, especially at large scales.
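As a rough illustration of multi-cluster assignment in general (not ScaNN's or SOAR's actual implementation), the NumPy sketch below stores each vector in its nearest few partitions and then searches only the partitions closest to the query. The function names, the parameter `num_assignments`, and the use of squared Euclidean distance are all assumptions made for this example.

```python
import numpy as np

def assign_with_spilling(data, centroids, num_assignments=2):
    """Return, for each vector, the indices of its closest centroids."""
    # Squared Euclidean distance from every vector to every centroid.
    d2 = ((data[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    # Keep the num_assignments nearest centroids per vector, nearest first.
    return np.argsort(d2, axis=1)[:, :num_assignments]

def build_partitions(assignments, num_centroids):
    """Invert the assignments into one list of vector ids per partition."""
    partitions = [[] for _ in range(num_centroids)]
    for vec_id, cluster_ids in enumerate(assignments):
        for c in cluster_ids:
            partitions[int(c)].append(vec_id)
    return partitions

def search(query, data, centroids, partitions, num_probe=1, k=1):
    """Score only the vectors stored in the num_probe closest partitions."""
    order = np.argsort(((centroids - query) ** 2).sum(axis=1))[:num_probe]
    # A vector stored in several probed partitions is scored only once.
    candidates = sorted({i for c in order for i in partitions[int(c)]})
    dists = ((data[candidates] - query) ** 2).sum(axis=1)
    best = np.argsort(dists)[:k]
    return [candidates[int(i)] for i in best]

# Hypothetical toy data: 200 random vectors, 10 of them reused as centroids.
rng = np.random.default_rng(0)
data = rng.normal(size=(200, 8))
centroids = data[:10].copy()
partitions = build_partitions(assign_with_spilling(data, centroids), 10)
neighbors = search(data[5], data, centroids, partitions, num_probe=3, k=5)
```

Because every vector lives in two partitions here, the index stores twice as many vector ids as a single-assignment index; the point of the redundancy is that a neighbor missed in one partition can still be reached through another.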

Orthogonality-amplified residuals (conceptual)

Uses residual-based techniques that emphasize orthogonality properties to make redundancy more effective for search.
Designed to complement ScaNN’s existing partitioning and scoring mechanisms.
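One way to picture the orthogonality idea is as a modified rule for choosing a vector's secondary cluster: instead of simply taking the next-nearest centroid, penalize candidates whose residual points along the same line as the primary residual. The sketch below is a simplified reading of that idea, not the published algorithm or ScaNN code; the loss form (squared residual length plus a weighted squared parallel component), the weight `lam`, and the function name are assumptions for illustration.

```python
import numpy as np

def soar_style_secondary(x, centroids, primary, lam=1.0):
    """Pick a secondary centroid whose residual is short and, as far as
    possible, orthogonal to the primary residual (illustrative loss only)."""
    r = x - centroids[primary]            # primary residual
    r_norm2 = float(r @ r)
    best, best_loss = None, np.inf
    for j, c in enumerate(centroids):
        if j == primary:
            continue
        r2 = x - c                        # candidate secondary residual
        # Squared length of the component of r2 that is parallel to r.
        parallel2 = float(r2 @ r) ** 2 / r_norm2 if r_norm2 > 0 else 0.0
        loss = float(r2 @ r2) + lam * parallel2
        if loss < best_loss:
            best, best_loss = j, loss
    return best

# Hypothetical 2-D example: centroid 1 is slightly closer to x than
# centroid 2, but its residual is collinear with the primary residual.
centroids = np.array([[1.0, 0.0], [-1.1, 0.0], [0.0, 1.3]])
x = np.zeros(2)
plain = soar_style_secondary(x, centroids, primary=0, lam=0.0)
amplified = soar_style_secondary(x, centroids, primary=0, lam=1.0)
```

With `lam=0.0` the rule reduces to picking the nearest remaining centroid; with a positive `lam` the slightly farther but orthogonal centroid wins, so the two assignments are less likely to fail on the same queries.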

Faster vector search at scale

Targets large-scale deployments where brute-force search is infeasible.
Aims to reduce latency and computation per query for vector similarity search workloads.
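To make the per-query cost concrete, a back-of-the-envelope model for a generic partitioned index (an assumption for illustration, not a measurement of ScaNN or SOAR) counts one distance computation per centroid plus one per vector in each probed partition. The function and parameter names below are hypothetical, and partitions are assumed to be evenly sized.

```python
def distance_computations(num_vectors, num_partitions, num_probe,
                          assignments_per_vector=1):
    """Approximate distance computations per query for a partitioned index."""
    # Every centroid is scored once to decide which partitions to probe.
    centroid_scoring = num_partitions
    # Redundant assignments make each partition proportionally larger.
    avg_partition_size = assignments_per_vector * num_vectors / num_partitions
    return centroid_scoring + num_probe * avg_partition_size

brute_force = 1_000_000
single = distance_computations(1_000_000, 1000, num_probe=10)
spilled = distance_computations(1_000_000, 1000, num_probe=10,
                                assignments_per_vector=2)
```

Under this model, redundancy raises the cost of each probed partition, so it only pays off if it lets the search reach the same recall while probing fewer partitions.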

Smaller, efficient indexes

Seeks a balance between added redundancy and compact index representation, keeping index growth modest while improving recall and performance.

Integration with ScaNN

Builds directly on the ScaNN library, which is open source and already widely used.
Can be applied in settings where ScaNN is used for embedding-based retrieval within production ML systems.
