IVF-PQ (Inverted File with Product Quantization)

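Vector indexing method that combines an inverted file index with product quantization for memory-efficient approximate nearest-neighbor search. For a 128-dimensional float32 vector encoded as 32 one-byte codes, storage drops from 128 × 4 = 512 bytes to 32 × 1 = 32 bytes (1/16th of the original) at the cost of a small loss in recall.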

Overview

IVF-PQ (Inverted File with Product Quantization) is a vector indexing method that combines two techniques: inverted file indexing for efficient search space reduction and product quantization for memory-efficient vector storage.

How It Works

Inverted File (IVF)

Partitions the vector space into clusters
Creates an inverted index mapping clusters to vectors
During search, only relevant clusters are examined
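
The three steps above can be sketched in a few lines of plain Python. This is a minimal illustration, not any library's implementation: the helper names (`kmeans`, `build_ivf`, `ivf_search`), the `n_probe` parameter, and the random test data are all assumptions made for the example, and k-means is used as the clustering step because it is the usual choice.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(vectors, k, iters=10, seed=0):
    """Plain Lloyd's k-means; returns k centroids."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in vectors:
            nearest = min(range(k), key=lambda c: sq_dist(v, centroids[c]))
            groups[nearest].append(v)
        for c, members in enumerate(groups):
            if members:  # keep the old centroid if a cluster went empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids

def build_ivf(vectors, n_clusters):
    """Partition the vectors into clusters and build cluster -> vector-id lists."""
    centroids = kmeans(vectors, n_clusters)
    inverted_lists = [[] for _ in range(n_clusters)]
    for vec_id, v in enumerate(vectors):
        nearest = min(range(n_clusters), key=lambda c: sq_dist(v, centroids[c]))
        inverted_lists[nearest].append(vec_id)
    return centroids, inverted_lists

def ivf_search(query, vectors, centroids, inverted_lists, n_probe=2, top_k=3):
    """Scan only the n_probe clusters whose centroids are closest to the query."""
    order = sorted(range(len(centroids)), key=lambda c: sq_dist(query, centroids[c]))
    candidates = [i for c in order[:n_probe] for i in inverted_lists[c]]
    return sorted(candidates, key=lambda i: sq_dist(query, vectors[i]))[:top_k]

# Hypothetical data: 200 random 8-dimensional vectors.
rng = random.Random(42)
data = [[rng.gauss(0, 1) for _ in range(8)] for _ in range(200)]
centroids, lists = build_ivf(data, n_clusters=8)
hits = ivf_search(data[0], data, centroids, lists, n_probe=2, top_k=3)
```

Raising `n_probe` scans more clusters, which generally improves recall and costs more distance computations; setting it to the number of clusters degenerates to an exhaustive scan.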

Product Quantization (PQ)

Divides vectors into subvectors
Quantizes each subvector independently
Dramatically reduces memory footprint
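
As a rough sketch of the idea, the code below trains one small codebook per subspace, encodes a vector as a list of centroid indices, and computes a distance directly from the code using per-subspace lookup tables (commonly called asymmetric distance computation). Function names, the subvector count `m`, the codebook size `k`, and the random data are assumptions chosen to keep the example small; real systems typically use 256 centroids per subspace so each index fits in one byte.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=8, seed=0):
    """Plain Lloyd's k-means; returns k centroids."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[min(range(k), key=lambda c: sq_dist(p, centroids[c]))].append(p)
        for c, members in enumerate(groups):
            if members:  # keep the old centroid if a cluster went empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids

def split(v, m):
    """Cut a vector into m equal-length subvectors."""
    step = len(v) // m
    return [v[i * step:(i + 1) * step] for i in range(m)]

def train_pq(vectors, m, k):
    """Learn one codebook of k centroids for each of the m subspaces."""
    return [kmeans([split(v, m)[j] for v in vectors], k) for j in range(m)]

def encode(v, codebooks):
    """Replace each subvector with the index of its nearest centroid."""
    return [min(range(len(cb)), key=lambda c: sq_dist(sub, cb[c]))
            for sub, cb in zip(split(v, len(codebooks)), codebooks)]

def decode(code, codebooks):
    """Approximate reconstruction: concatenate the selected centroids."""
    return [x for idx, cb in zip(code, codebooks) for x in cb[idx]]

def adc_distance(query, code, codebooks):
    """Distance from an uncompressed query to a compressed vector via lookup tables."""
    tables = [[sq_dist(sub, centroid) for centroid in cb]
              for sub, cb in zip(split(query, len(codebooks)), codebooks)]
    return sum(table[idx] for table, idx in zip(tables, code))

# Hypothetical data: 300 random 16-dimensional vectors, 4 subvectors, 16 centroids each.
rng = random.Random(1)
data = [[rng.gauss(0, 1) for _ in range(16)] for _ in range(300)]
codebooks = train_pq(data, m=4, k=16)
code = encode(data[0], codebooks)      # 4 small integers instead of 16 floats
approx = decode(code, codebooks)       # lossy reconstruction of data[0]
dist = adc_distance(data[1], code, codebooks)
```

In practice the lookup tables are built once per query and reused for every code scanned, so each candidate costs only `m` table lookups and additions.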

Storage Efficiency

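For 128-dimensional float32 vectors divided into 32 subvectors of 4 dimensions each, with each subvector encoded as a 1-byte code (up to 256 centroids per subspace):

Original storage: 128 × 4 bytes = 512 bytes
PQ code storage: 32 × 1 byte = 32 bytes
Compression ratio: 1/16th of original size (16x smaller), excluding the shared codebooks and per-vector IDs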
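
The arithmetic above can be checked with a short calculation. The function names and the default of 256 centroids per subspace (one byte per code) are assumptions for illustration.

```python
import math

def raw_bytes(dim, bytes_per_component=4):
    """Size of an uncompressed vector (4 bytes per float32 component)."""
    return dim * bytes_per_component

def pq_code_bytes(num_subvectors, centroids_per_subspace=256):
    """Size of one PQ code: one centroid index per subvector."""
    bits_per_index = math.ceil(math.log2(centroids_per_subspace))
    return math.ceil(num_subvectors * bits_per_index / 8)

original = raw_bytes(128)          # 128 * 4 = 512 bytes
compressed = pq_code_bytes(32)     # 32 * 1 = 32 bytes
ratio = original / compressed      # 16.0, i.e. 1/16th of the original size

# Codebooks are shared by all vectors: 256 centroids * 128 dims * 4 bytes in total.
codebook_bytes = 256 * 128 * 4     # 131072 bytes (128 KiB)
```

The codebook cost is fixed, so it becomes negligible as the number of indexed vectors grows.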

Performance Characteristics

Reported benchmark results for ScaNN (which builds upon IVF-PQ):

5x QPS improvement over IVFFLAT on Cohere1M dataset
6x QPS improvement over basic IVF-PQ
Maintains high recall rates with compressed vectors

Relationship to ScaNN

ScaNN is based on the IVF-PQ framework but introduces key optimizations:

Score-aware quantization loss
Anisotropic loss functions
SIMD in-register lookup tables
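
To make the anisotropic loss concrete, the sketch below splits the quantization error into a component parallel to the original vector and a component orthogonal to it, then weights the parallel part more heavily. This is only an illustration of the idea: the function name and the fixed weights `parallel_weight` and `orthogonal_weight` are hypothetical, and ScaNN derives its own weights rather than using constants like these.

```python
def anisotropic_loss(x, x_quantized, parallel_weight=4.0, orthogonal_weight=1.0):
    """Weighted squared error that penalizes the residual parallel to x more heavily."""
    residual = [a - b for a, b in zip(x, x_quantized)]
    norm_sq = sum(a * a for a in x)
    # Project the residual onto x to get its parallel component.
    scale = sum(r * a for r, a in zip(residual, x)) / norm_sq
    parallel = [scale * a for a in x]
    orthogonal = [r - p for r, p in zip(residual, parallel)]
    return (parallel_weight * sum(p * p for p in parallel)
            + orthogonal_weight * sum(o * o for o in orthogonal))

x = [1.0, 0.0]
# Same error magnitude (0.5), different directions relative to x.
parallel_error = anisotropic_loss(x, [0.5, 0.0])     # error along x
orthogonal_error = anisotropic_loss(x, [1.0, 0.5])   # error perpendicular to x
```

With equal weights this reduces to the ordinary squared reconstruction error; the unequal weights reflect that parallel error has a larger effect on inner products with queries that score highly against `x`.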

Applications

Large-scale vector search with memory constraints
Balancing search speed and memory usage
Systems requiring high throughput with limited resources

Trade-offs

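Reduces memory usage significantly (16x in the 128-dimensional example above)
Some loss of recall compared to exact search, because distances are computed on compressed codes and only a subset of clusters is examined
Faster than exact search but slower than some graph-based methods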

Information

Website: lancedb.com

Published: Mar 8, 2026

Tags

#quantization #indexing #compression

Similar Products

Binary Quantization for Vector Search

Compression technique that converts full-precision vectors to binary representations, achieving 32x storage reduction while maintaining 90-95% recall for efficient large-scale vector search.

Statistical Binary Quantization

Compression method developed by Timescale researchers that improves on standard Binary Quantization, reducing vector memory footprint by 32x while maintaining high accuracy for filtered searches.

BBQ Binary Quantization

Elasticsearch and Lucene's implementation of RaBitQ algorithm for 1-bit vector quantization, renamed as BBQ. Provides 32x compression with asymptotically optimal error bounds, enabling efficient vector search at massive scale with minimal accuracy loss.

Locally-Adaptive Vector Quantization

Advanced quantization technique that applies per-vector normalization and scalar quantization, adapting the quantization bounds individually for each vector. Achieves four-fold reduction in vector size while maintaining search accuracy with 26-37% overall memory footprint reduction.

Anisotropic Vector Quantization

An advanced quantization technique introduced by Google's ScaNN that prioritizes preserving parallel components between vectors rather than minimizing overall distance. Optimized for Maximum Inner Product Search (MIPS) and significantly improves retrieval accuracy.

Binary Quantization

Extreme vector compression technique converting each dimension to a single bit (0 or 1), achieving 32x memory reduction and enabling ultra-fast Hamming distance calculations with acceptable accuracy trade-offs.
