LibVQ
LibVQ is an open-source toolkit for optimizing vector quantization and efficient neural retrieval, offering training and indexing components that can serve as the core of high-performance approximate nearest neighbor search and vector database systems.
About this tool
title: LibVQ
slug: libvq
category: sdks-libraries
source_url: https://github.com/staoxiao/LibVQ
images:
- https://opengraph.githubassets.com/1/staoxiao/LibVQ
tags:
- vector-quantization
- neural-search
- ann
Description
LibVQ is an open-source library for dense-retrieval–oriented vector quantization. It provides training and indexing components that optimize vector quantization for retrieval quality, and can serve as the core of high-performance approximate nearest neighbor (ANN) search and vector database systems.
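To make the memory side of this concrete, the sketch below works out the size of a product quantization (PQ) code and the resulting compression ratio. The 768-dimensional float32 embeddings, the 32 sub-vectors, and the 8 bits per sub-vector are illustrative assumptions rather than values stated on this page, and the helper names are hypothetical (they are not part of LibVQ's API).

```python
def pq_code_bytes(num_subvectors: int, bits_per_code: int = 8) -> int:
    """Bytes needed to store one PQ code (one codeword id per sub-vector)."""
    total_bits = num_subvectors * bits_per_code
    return (total_bits + 7) // 8  # round up to whole bytes


def compression_ratio(dim: int, num_subvectors: int,
                      bits_per_code: int = 8, bytes_per_float: int = 4) -> float:
    """Size of the raw float vector divided by the size of its PQ code."""
    raw_bytes = dim * bytes_per_float
    return raw_bytes / pq_code_bytes(num_subvectors, bits_per_code)


# Assumed setup: 768-dim float32 embeddings, 32 sub-vectors, 8 bits each.
print(pq_code_bytes(32))           # 32 bytes per vector instead of 3072
print(compression_ratio(768, 32))  # 96.0
```

Under these assumptions the ratio comes out at 96x, the same figure quoted for the MSMARCO example under Features. Other dimension and sub-vector combinations would give different ratios.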
Features
- Dense retrieval–oriented vector quantization
  - Designed specifically to improve retrieval quality compared with conventional VQ methods (e.g., IVF, PQ, OPQ).
  - Targets real-time and memory-efficient dense retrieval scenarios.
- Knowledge distillation–based learning
  - Uses knowledge distillation to learn VQ parameters from off-the-shelf embeddings.
  - Can operate directly on existing dense embeddings without modifying upstream models.
  - Aims for stronger retrieval metrics (e.g., MRR@10, Recall@10/100) than other VQ-based ANN indexes.
- Flexible usage modes
  - Train only VQ/index parameters while keeping encoders fixed.
  - Jointly adapt and train the query encoder together with the index.
  - Supports different training strategies (e.g., contrastive index training, distillation-based index training).
- Rich input condition support
  - Works with only off-the-shelf embeddings when no extra signals are available.
  - Optionally leverages extra supervision such as:
    - Relevance labels.
    - Source queries.
  - Can be configured for both labeled and unlabeled training settings.
- PyTorch-based training
  - Training pipeline implemented with PyTorch.
  - Configurable for different computation resources and training setups.
- FAISS-backed ANN deployment
  - Exports trained VQ parameters to FAISS-based indexes (e.g., IndexPQ, IndexIVFPQ).
  - Resulting indexes are directly deployable for large-scale dense retrieval.
  - Integrates with common ANN backends similar to FAISS, ScaNN, etc.
- Example workflows and benchmarks
  - Example pipelines for constructing and training indexes (documented in the Docs and examples folders).
  - MSMARCO example demonstrating:
    - IVFPQ and PQ settings with a fixed compression ratio (e.g., 96x compression).
    - Multiple training recipes, including:
      - contrastive_index
      - distill_index
      - distill_index_nolabel
      - contrastive_index-and-query-encoder
      - distill_index-and-query-encoder
      - distill_index-and-query-encoder_nolabel
    - Reported metrics such as MRR@10, Recall@10, Recall@100 for these methods and for baseline FAISS/ScaNN indexes.
- Simple installation from source
  - Installable via pip after cloning the repository:
    git clone https://github.com/staoxiao/LibVQ.git
    cd LibVQ
    pip install .
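The benchmark metrics named in the feature list (MRR@10, Recall@10, Recall@100) can be computed from ranked result lists as sketched below. This is a minimal illustration using one common definition of each metric; the function names, the toy query and document ids, and the data layout are assumptions made for the example and are not taken from LibVQ's evaluation code.

```python
from typing import Dict, List, Set


def mrr_at_k(rankings: Dict[str, List[str]],
             relevant: Dict[str, Set[str]], k: int = 10) -> float:
    """Mean reciprocal rank of the first relevant result within the top k."""
    total = 0.0
    for qid, ranked in rankings.items():
        rel = relevant.get(qid, set())
        for rank, doc_id in enumerate(ranked[:k], start=1):
            if doc_id in rel:
                total += 1.0 / rank
                break  # only the first relevant hit counts
    return total / len(rankings) if rankings else 0.0


def recall_at_k(rankings: Dict[str, List[str]],
                relevant: Dict[str, Set[str]], k: int) -> float:
    """Average fraction of each query's relevant documents found in the top k."""
    scores = []
    for qid, ranked in rankings.items():
        rel = relevant.get(qid, set())
        if not rel:
            continue  # skip queries without labels
        scores.append(len(rel.intersection(ranked[:k])) / len(rel))
    return sum(scores) / len(scores) if scores else 0.0


# Toy data (hypothetical ids): ranked results per query and labeled positives.
rankings = {"q1": ["d3", "d1", "d7"], "q2": ["d5", "d2", "d9"]}
relevant = {"q1": {"d1"}, "q2": {"d9", "d4"}}

print(mrr_at_k(rankings, relevant, k=10))     # (1/2 + 1/3) / 2
print(recall_at_k(rankings, relevant, k=2))   # 0.5
print(recall_at_k(rankings, relevant, k=10))  # 0.75
```

The same functions apply to any index, so results from a trained index and from a baseline FAISS or ScaNN index can be scored on the same footing.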
Installation
git clone https://github.com/staoxiao/LibVQ.git
cd LibVQ
pip install .
Pricing
LibVQ is an open-source library; no pricing information or paid plans are specified.
Similar Products
SymphonyQG is a research codebase and method that integrates vector quantization with graph-based indexing to build efficient approximate nearest neighbor (ANN) indexes for high-dimensional vector search. It targets vector database and similarity search scenarios where combining compact codes with navigable graphs can improve recall–latency tradeoffs and memory footprint.
Ruby gem for approximate nearest neighbor search that can integrate with pgvector and other backends to power vector similarity search in Ruby applications.
EFANNA is an extremely fast approximate nearest neighbor search algorithm based on kNN graphs and randomized KD-trees. The provided implementation offers a high-performance ANN index suitable as a building block in custom vector search and retrieval infrastructure.
A Go implementation of the HNSW approximate nearest neighbor search algorithm, enabling developers to embed efficient vector similarity search directly into Go services and custom vector database solutions.
A Rust implementation of the HNSW (Hierarchical Navigable Small World) approximate nearest neighbor search algorithm, useful for building high-performance, memory-safe vector search components in Rust-based AI and retrieval systems.
iRangeGraph is an ANN indexing approach and accompanying implementation for range-filtering nearest neighbor search. It provides a specialized graph-based index that supports vector similarity search under range constraints, making it directly useful as a component or reference implementation for advanced vector database indexing and retrieval.