PQ (Product Quantization)
Product Quantization is a compression and indexing technique for vector search that splits vectors into subspaces and quantizes each part separately, allowing vector databases to store large-scale embeddings compactly while supporting efficient ANN search.
About this tool
title: "PQ (Product Quantization)"
slug: "pq-product-quantization"
brand: "faiss"
brand_logo_url: "https://faiss.ai/images/faiss_logo_black.svg"
category: "concepts-definitions"
featured: false
tags:
- quantization
- ann
- vector-compression
images:
- "https://faiss.ai/images/pq.png"
source_url: "https://cybergarden.au/blog/5-powerful-vector-database-tools-2025"
Overview
PQ (Product Quantization) is a vector compression and indexing technique commonly used in vector databases and similarity search libraries (such as Faiss). It enables efficient approximate nearest neighbor (ANN) search by splitting high-dimensional vectors into multiple subspaces and quantizing each subspace separately. This allows large-scale embedding collections to be stored compactly while still supporting fast semantic search.
Key Idea
Instead of storing full-precision vectors, Product Quantization:
- Divides each original vector into smaller, disjoint sub-vectors (subspaces).
- Learns a separate codebook (set of centroids) for each subspace.
- Represents each sub-vector by the index of its nearest centroid in the corresponding codebook.
The resulting compressed representation significantly reduces memory usage while preserving enough structure for efficient ANN search.
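The three steps above can be sketched in NumPy. This is an illustrative toy, not the Faiss implementation: the tiny k-means, the function names, and the sizes (32-dim vectors, 4 subspaces, 16 centroids) are all assumptions chosen to keep the example small.

```python
import numpy as np

rng = np.random.default_rng(0)

def kmeans(data, k, iters=20):
    """Tiny k-means used to learn one subspace codebook (illustrative only)."""
    centroids = data[rng.choice(len(data), k, replace=False)]
    for _ in range(iters):
        # Assign each point to its nearest centroid, then recompute means.
        dists = ((data[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
        assign = dists.argmin(1)
        for j in range(k):
            members = data[assign == j]
            if len(members):
                centroids[j] = members.mean(0)
    return centroids

def pq_train(X, m, k):
    """Step 1 + 2: split columns into m disjoint blocks, learn one codebook each."""
    return [kmeans(sub, k) for sub in np.split(X, m, axis=1)]

def pq_encode(X, codebooks):
    """Step 3: replace each sub-vector with the index of its nearest centroid."""
    codes = []
    for sub, cb in zip(np.split(X, len(codebooks), axis=1), codebooks):
        d = ((sub[:, None, :] - cb[None, :, :]) ** 2).sum(-1)
        codes.append(d.argmin(1))
    return np.stack(codes, axis=1).astype(np.uint8)

def pq_decode(codes, codebooks):
    """Reconstruct approximate vectors by looking centroids back up."""
    return np.hstack([cb[codes[:, i]] for i, cb in enumerate(codebooks)])

X = rng.normal(size=(1000, 32)).astype(np.float32)
codebooks = pq_train(X, m=4, k=16)   # 4 subspaces of 8 dims, 16 centroids each
codes = pq_encode(X, codebooks)      # shape (1000, 4): one small code per subspace
X_hat = pq_decode(codes, codebooks)  # lossy reconstruction of X
```

Each 32-dim float32 vector (128 bytes) is reduced to 4 one-byte codes; `X_hat` shows what is lost to quantization.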
Features
- Vector splitting into subspaces: High-dimensional vectors are partitioned into multiple lower-dimensional blocks, enabling independent quantization of each part.
- Subspace quantization (codebooks): Each subspace has its own learned codebook of centroids; sub-vectors are approximated by the nearest centroid index.
- Compact embedding storage: Full-precision floats are replaced with short integer codes, reducing RAM and disk footprint for large collections of embeddings.
- Efficient approximate nearest neighbor (ANN) search: Distance computations are performed in the compressed space using precomputed lookup tables, enabling fast similarity search over millions or billions of vectors.
- Speed–accuracy trade-off control: The number of subspaces, centroids per subspace, and code size can be tuned to balance search accuracy against memory usage and query latency.
- Compatibility with vector indexes: Often combined with other ANN indexing techniques (e.g., IVF, HNSW in systems like Faiss) to further improve search performance at scale.
- Scalability to large datasets: Designed to support very large vector collections by minimizing per-vector storage, making it practical to keep massive embedding sets in memory or on fast storage.
- Support for semantic search use cases: Well-suited for applications that rely on embedding similarity (e.g., text, image, or multimodal search) where exact nearest neighbor search would be too expensive.
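The lookup-table search mentioned above is often called asymmetric distance computation (ADC): the query stays full-precision while database vectors stay compressed. A minimal NumPy sketch, assuming random codebooks and codes purely for illustration (sizes and variable names are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(1)
m, k, dsub, n = 4, 16, 8, 100       # subspaces, centroids, sub-dim, database size
codebooks = np.stack([rng.normal(size=(k, dsub)) for _ in range(m)])  # (m, k, dsub)
codes = rng.integers(0, k, size=(n, m))   # PQ codes: one centroid index per subspace
query = rng.normal(size=m * dsub)

# Per-query lookup table: table[i, c] = squared distance from the i-th query
# sub-vector to centroid c of the i-th codebook. Built once per query, cost O(m*k).
subq = query.reshape(m, dsub)
table = ((codebooks - subq[:, None, :]) ** 2).sum(-1)    # shape (m, k)

# Approximate squared distance to every database vector: m table lookups and a
# sum per vector; the compressed codes are never decompressed.
approx = table[np.arange(m)[None, :], codes].sum(1)      # shape (n,)
top5 = np.argsort(approx)[:5]
```

Because the per-subspace squared distances simply add up, `approx` equals the exact squared distance between the query and each vector's PQ reconstruction; the approximation error comes entirely from quantization, not from the lookup trick.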
Use Cases
- Large-scale semantic search over text, images, or documents.
- Vector databases powering LLM-based retrieval, RAG, and recommendation.
- Any scenario requiring memory-efficient storage of high-dimensional embeddings with fast ANN queries.
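To make the memory savings concrete, a back-of-envelope calculation with hypothetical but typical numbers (10 million 768-dim float32 embeddings, compressed to 96 subspaces with 8-bit codes):

```python
d, n = 768, 10_000_000        # embedding dimension, number of vectors (assumed)
raw_bytes = n * d * 4         # full precision: 4 bytes per float32 dimension
m, nbits = 96, 8              # 96 subspaces, 8-bit codes -> 1 byte per subspace
pq_bytes = n * m * nbits // 8

print(f"{raw_bytes / 2**30:.1f} GiB raw")   # ~28.6 GiB
print(f"{pq_bytes / 2**30:.1f} GiB PQ")     # ~0.9 GiB
print(f"{raw_bytes // pq_bytes}x smaller")  # 32x
```

At a 32x reduction, a collection that would need a large-memory server in full precision fits comfortably in the RAM of a modest machine, which is the core reason PQ appears in billion-scale deployments.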
Pricing
Product Quantization (PQ) is a technique, not a standalone commercial product, so it has no pricing model of its own. Costs depend on the specific database or library (e.g., Faiss or a managed vector database) in which PQ is implemented.
Similar Products
- RaBitQ is an open-source library implementing the "Quantizing High-Dimensional Vectors with a Theoretical Error Bound for Approximate Nearest Neighbor Search" method, providing vector quantization and compression techniques designed to improve the efficiency and accuracy of ANN search engines and vector databases operating in high-dimensional spaces.
- nanopq is a lightweight product quantization library for efficient vector compression and similarity search, an important capability for vector databases that need to store and query large-scale vector data efficiently.
- IVF is an indexing technique widely used in vector databases where vectors are clustered into inverted lists (partitions), enabling efficient approximate nearest neighbor search by probing only a subset of relevant partitions at query time.
- Locality-Sensitive Hashing (LSH) is an algorithmic technique for approximate nearest neighbor search in high-dimensional vector spaces, commonly used in vector databases to speed up similarity search while reducing memory footprint.
- Online Product Quantization (O-PQ) is a variant of product quantization designed to support dynamic or streaming data. It enables adaptive updating of quantization codebooks and codes in real time, making it suitable for vector databases that handle evolving datasets.
- Optimized Product Quantization (OPQ) enhances Product Quantization by optimizing the space decomposition and codebooks, leading to lower quantization distortion and higher accuracy in vector search. OPQ is widely used in advanced vector databases to improve recall and search quality.