    MUVERA

    Multi-Vector Retrieval Algorithm that reduces multi-vector similarity search to single-vector similarity search via Fixed Dimensional Encodings. Achieves an average of 10% improved recall with 90% lower latency compared to prior state-of-the-art implementations.

    Surveys


    Information

    Website: research.google
    Published: Mar 26, 2026

    Categories

    Research Papers & Surveys

    Tags

    #Multi Vector, #Google, #Efficiency

    Similar Products

    ConstBERT

    Novel approach to reduce storage footprint of multi-vector retrieval by encoding each document with a fixed, smaller set of learned embeddings. Reduces index sizes by over 50% compared to ColBERT while retaining most effectiveness.

    Intelligence Per Watt

    Research metric from Stanford measuring AI model efficiency, showing local language models improved 5.3× from 2023 to 2025, handling 88.7% of single-turn queries.

    ScaNN

    Google Research's efficient vector similarity search implementation using asymmetric hashing and anisotropic vector quantization, optimized for billion-scale retrieval with high recall.

    FastPLAID

    Optimized implementation of PLAID index for fast ColBERT retrieval, providing 10x storage compression and sub-200ms latency. Default index backend for PyLate library, enabling efficient multi-vector late interaction retrieval.

    CSRv2

    Contrastive Sparse Representation learning approach for ultra-sparse embeddings that achieves 7x speedup over Matryoshka Representation Learning with 300x improvements in compute and memory efficiency.

    EmbeddingGemma

    Google's text embedding model based on the Gemma architecture, available through Ollama and other platforms. Designed for generating high-quality embeddings for semantic search, retrieval, and various NLP tasks with efficient resource utilization.

    Overview

    MUVERA (MUlti-VEctor Retrieval Algorithm) is a retrieval mechanism that reduces multi-vector similarity search to single-vector similarity search. Its central idea is to compress the whole set of vectors that represents a query or a document into a single vector, called a Fixed Dimensional Encoding (FDE), which standard single-vector search can handle.
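
    For reference, the multi-vector similarity used by late-interaction models such as ColBERT, and the one the MUVERA paper targets, is the Chamfer similarity: each query vector is matched to its best-scoring document vector, and those best scores are summed. The sketch below is a minimal NumPy version; the function name chamfer_similarity is illustrative and not taken from any library.

```python
import numpy as np


def chamfer_similarity(query_vecs: np.ndarray, doc_vecs: np.ndarray) -> float:
    """Sum, over query vectors, of the best inner product with any document vector."""
    # scores[i, j] = <query_vecs[i], doc_vecs[j]>
    scores = query_vecs @ doc_vecs.T
    return float(scores.max(axis=1).sum())


query = np.array([[1.0, 0.0], [0.0, 1.0]])
doc = np.array([[1.0, 0.0], [0.5, 0.5]])
print(chamfer_similarity(query, doc))  # 1.0 + 0.5 = 1.5
```

    Computing this score needs every query vector compared against every document vector, which is the cost that a single-vector proxy is meant to avoid during candidate retrieval.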

    Key Innovation

    MUVERA generates Fixed Dimensional Encodings (FDEs) asymmetrically, encoding queries and documents differently, so that the inner product of a query FDE and a document FDE approximates their multi-vector similarity. FDEs give high-quality ε-approximations, thus providing the first single-vector proxy for multi-vector similarity with theoretical guarantees.
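
    A simplified sketch of how such an encoding can be built is shown below. It assumes the construction described in the MUVERA paper in outline only: vectors are partitioned into buckets by the signs of random projections (SimHash), queries keep the per-bucket sum, documents keep the per-bucket mean, and several independent repetitions are concatenated. The paper's additional steps (inner and final random projections, filling empty document buckets) are deliberately left out, and the names fde, simhash_buckets, k_sim and reps, as well as the 1/sqrt(reps) scaling, are choices made for this illustration rather than an official API.

```python
import numpy as np


def simhash_buckets(vecs: np.ndarray, hyperplanes: np.ndarray) -> np.ndarray:
    """Map each vector to a bucket id from the signs of its random projections."""
    bits = (vecs @ hyperplanes.T) > 0              # shape (n, k_sim)
    weights = 1 << np.arange(hyperplanes.shape[0])
    return bits.astype(np.int64) @ weights         # ids in [0, 2**k_sim)


def fde(vecs: np.ndarray, *, is_query: bool, k_sim: int = 3, reps: int = 8,
        seed: int = 0) -> np.ndarray:
    """Simplified Fixed Dimensional Encoding of one set of vectors (sketch only)."""
    _, dim = vecs.shape
    rng = np.random.default_rng(seed)              # queries and documents must share the seed
    blocks = []
    for _ in range(reps):
        hyperplanes = rng.standard_normal((k_sim, dim))
        ids = simhash_buckets(vecs, hyperplanes)
        block = np.zeros((2 ** k_sim, dim))
        np.add.at(block, ids, vecs)                # per-bucket sum
        if not is_query:
            counts = np.bincount(ids, minlength=2 ** k_sim)
            nonempty = counts > 0
            block[nonempty] /= counts[nonempty, None]  # per-bucket mean for documents
        blocks.append(block.ravel())
    # Illustrative scaling: the dot product of two FDEs averages over repetitions.
    return np.concatenate(blocks) / np.sqrt(reps)


rng = np.random.default_rng(1)
query = rng.standard_normal((4, 16))
doc = rng.standard_normal((10, 16))
q_fde = fde(query, is_query=True)
d_fde = fde(doc, is_query=False)
print(q_fde.shape)              # (1024,) = reps * 2**k_sim * dim = 8 * 8 * 16
approx = float(q_fde @ d_fde)   # single-vector proxy for the multi-vector score
```

    The output dimension is fixed by the parameters (reps × 2^k_sim × dim in this sketch), regardless of how many vectors the query or document contains, which is what lets an ordinary inner-product index store and search the encodings.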

    Performance Benefits

    Retrieval Efficiency

    • Matches the recall of prior state-of-the-art heuristics while retrieving 2-5× fewer candidates
    • Achieves an average of 10% improved recall with 90% lower latency compared to prior state-of-the-art implementations
    • Delivers consistently good end-to-end recall and latency across diverse BEIR retrieval datasets
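
    The candidate counts above matter because, as the approach is usually described, FDE search is only the first stage: a shortlist is retrieved by inner product over FDEs and then re-ranked with the exact multi-vector score, so fewer candidates means less re-ranking work. The sketch below illustrates that two-stage flow under stated assumptions: the function name retrieve_then_rerank is hypothetical, the first stage is brute force rather than an approximate nearest-neighbor index, and the toy "FDEs" are a one-bucket stand-in (sum for the query, mean for each document).

```python
import numpy as np


def retrieve_then_rerank(query_vecs, query_fde, doc_vecs_list, doc_fdes,
                         n_candidates=10, top_k=3):
    """Stage 1: inner-product search over FDEs. Stage 2: exact Chamfer re-ranking."""
    # Stage 1: brute-force inner-product search; a real system would use an ANN index.
    proxy_scores = doc_fdes @ query_fde
    n_candidates = min(n_candidates, len(doc_vecs_list))
    candidates = np.argsort(-proxy_scores)[:n_candidates]

    # Stage 2: exact multi-vector score, computed only for the shortlist.
    def chamfer(doc_vecs):
        return float((query_vecs @ doc_vecs.T).max(axis=1).sum())

    reranked = sorted(candidates.tolist(),
                      key=lambda i: chamfer(doc_vecs_list[i]), reverse=True)
    return reranked[:top_k]


# Toy data: one query vector and three single-vector documents.
query_vecs = np.array([[1.0, 0.0]])
docs = [np.array([[1.0, 0.0]]), np.array([[0.0, 1.0]]), np.array([[0.5, 0.0]])]
query_fde = query_vecs.sum(axis=0)                    # stand-in query encoding
doc_fdes = np.stack([d.mean(axis=0) for d in docs])   # stand-in document encodings

result = retrieve_then_rerank(query_vecs, query_fde, docs, doc_fdes,
                              n_candidates=2, top_k=2)
print(result)  # [0, 2]
```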

    Data-Oblivious Design

    A key advantage of MUVERA is that the FDE transformation is data-oblivious: it does not depend on the specific dataset being indexed. This makes it:

    • Robust to changes in data distribution
    • Suitable for streaming applications
    • Adaptable to new data without retraining
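
    To make the data-oblivious property concrete, the sketch below assumes the partition is defined by random hyperplanes drawn from a fixed seed (SimHash-style); the constants and the helper bucket_ids are hypothetical names for this illustration. Because no document is inspected when the partition is created, documents arriving later, even from a shifted distribution, are bucketed with the same function and nothing has to be refit.

```python
import numpy as np

DIM, K_SIM, SEED = 8, 3, 42

# The partition is fixed by the seed alone; no document is inspected to build it.
hyperplanes = np.random.default_rng(SEED).standard_normal((K_SIM, DIM))


def bucket_ids(vecs: np.ndarray) -> np.ndarray:
    """Bucket id of each vector, from the signs of its random projections."""
    bits = (vecs @ hyperplanes.T) > 0
    return bits.astype(np.int64) @ (1 << np.arange(K_SIM))


data_rng = np.random.default_rng(0)
old_docs = data_rng.standard_normal((5, DIM))
ids_before = bucket_ids(old_docs)

# New documents arrive from a shifted distribution; nothing is retrained.
new_docs = data_rng.standard_normal((5, DIM)) + 3.0
ids_new = bucket_ids(new_docs)

# Existing assignments are unchanged, so previously stored encodings stay valid.
ids_after = bucket_ids(old_docs)
```

    A learned partition such as k-means centroids would instead be fitted to the data, so a shift in distribution could call for refitting and re-encoding.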

    Theoretical Foundations

    The inner product of a query FDE and a document FDE is proven to ε-approximate the multi-vector similarity, which makes MUVERA the first single-vector proxy for multi-vector similarity with such guarantees.

    Production Deployment

    The research was presented at NeurIPS 2024 and has been implemented in production systems. Weaviate added MUVERA support in version 1.31, demonstrating its practical value for real-world vector search applications.

    Use Cases

    • Large-scale semantic search requiring efficiency
    • Production systems needing reduced latency
    • Applications with streaming data requirements
    • Systems requiring both high recall and low latency

    Pricing

    Free research from Google, with open implementations available.