AiSAQ
AiSAQ is an all-in-storage approximate nearest neighbor search system that uses product quantization to enable DRAM-free vector similarity search, serving as a specialized vector search/indexing approach for large-scale information retrieval.
About this tool
title: AiSAQ
slug: aisaq
category: Research Papers & Surveys
tags: ann, similarity-search, vector-indexing
source_url: https://arxiv.org/pdf/2404.06004.pdf
featured: false
Overview
AiSAQ (All-in-Storage ANNS with Product Quantization) is a research method for approximate nearest neighbor search (ANNS) that places compressed vectors entirely on SSD, enabling DRAM-free (or near-DRAM-free) vector similarity search at billion-scale. It builds on DiskANN, modifying how product-quantized vectors are stored and accessed to drastically cut RAM usage while maintaining high recall and practical latency.
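The product quantization underpinning this design can be illustrated with a toy sketch (all sizes are illustrative, not AiSAQ's actual settings, and random sampling stands in for per-subspace k-means training): each vector is split into M sub-vectors, each sub-vector is replaced by the id of its nearest sub-codebook centroid, and queries are compared against the codes with asymmetric distance computation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy product quantization: split each vector into M subspaces and replace
# each sub-vector with the id of its nearest sub-codebook centroid.
D, M, K = 8, 4, 16          # dim, subspaces, centroids per subspace (illustrative)
d_sub = D // M

data = rng.standard_normal((1000, D)).astype(np.float32)

# "Train" codebooks by sampling data points (real systems run k-means per subspace).
codebooks = data[rng.choice(len(data), K, replace=False)]
codebooks = codebooks.reshape(K, M, d_sub).transpose(1, 0, 2)   # (M, K, d_sub)

def encode(x):
    """Return M one-byte codes: nearest centroid id in each subspace."""
    out = np.empty(M, dtype=np.uint8)
    for m in range(M):
        sub = x[m * d_sub:(m + 1) * d_sub]
        out[m] = np.argmin(((codebooks[m] - sub) ** 2).sum(axis=1))
    return out

def adc(query, code):
    """Asymmetric distance: query stays full precision, database point is PQ-coded."""
    dist = 0.0
    for m in range(M):
        sub = query[m * d_sub:(m + 1) * d_sub]
        dist += ((codebooks[m][code[m]] - sub) ** 2).sum()
    return dist

codes = np.array([encode(x) for x in data])      # 4 bytes per vector instead of 32
q = rng.standard_normal(D).astype(np.float32)
approx = np.array([adc(q, c) for c in codes])
exact = ((data - q) ** 2).sum(axis=1)
print(np.corrcoef(approx, exact)[0, 1])          # approximate distances track exact ones
```

In DiskANN these codes live in DRAM; AiSAQ's change is purely about *where* they are stored, not how they are computed.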
Key Details
- Type: Research paper / algorithmic method
- Domain: Large-scale vector search, information retrieval, RAG backends
- Core idea: Offload PQ-compressed vectors from DRAM into SSD-based indices, so memory usage no longer scales with dataset size.
- Code: DiskANN-based implementation available on GitHub: https://github.com/KioxiaAmerica/aisaq-diskann
Features
- All-in-storage PQ design
  - Compressed (product-quantized) node vectors are stored on SSD instead of being kept in DRAM.
  - Breaks the proportional relationship between DRAM usage and dataset size.
- Extremely low DRAM footprint
  - Achieves on the order of 10 MB of memory usage for query search on billion-scale vector datasets (as reported in the paper's abstract).
  - Suitable for environments where DRAM is costly or limited.
- Based on DiskANN
  - Uses DiskANN as the underlying graph-based ANNS framework.
  - Preserves the graph-search paradigm and re-ranking strategy, but changes where and how compressed vectors are stored.
- Product Quantization (PQ) for compression
  - Employs PQ to represent high-dimensional vectors compactly.
  - Sidesteps DiskANN's trade-off, in which stronger compression reduces memory use but also recall, by moving the PQ data to storage so memory no longer constrains the compression ratio.
- Maintains recall–latency balance
  - Designed to achieve DRAM-free or near-DRAM-free search “without critical latency degradation” compared to standard DiskANN setups.
  - Still uses full-precision vectors from storage for re-ranking along the search path.
- Fast index switching
  - Reduces the index load time required before queries can be served.
  - Makes it practical to switch rapidly between multiple billion-scale indices, useful when many vector collections must be queried selectively.
- Suitable for RAG (Retrieval-Augmented Generation)
  - Can act as a retriever backend for LLM-based RAG systems.
  - Multiple external knowledge sources can be stored as separate indices and switched on demand, without loading all index data into RAM.
- Scalability and multi-server deployment
  - Intended to scale out across multi-server systems for emerging, very large datasets.
  - The SSD-based index design fits distributed or sharded deployments.
- Use with vector database systems
  - Conceptually related to existing DiskANN-based services used in vector databases such as Weaviate and Zilliz (mentioned as context in the paper).
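The all-in-storage access pattern behind these features can be sketched in miniature: keep both the compressed codes and the full-precision vectors in files, memory-map them, and touch only the records on the search path, so resident memory stays flat as the dataset grows. In this sketch a linear scan and 8-bit scalar quantization stand in for DiskANN's graph walk and for PQ; the file names and sizes are hypothetical.

```python
import numpy as np
import os, tempfile

rng = np.random.default_rng(1)
N, D = 5000, 16                                   # toy sizes
data = rng.standard_normal((N, D)).astype(np.float32)

# 8-bit scalar quantization stands in for PQ to keep the sketch short;
# the storage-access pattern, not the codec, is the point here.
scale = float(np.abs(data).max() / 127.0)
codes = np.round(data / scale).astype(np.int8)    # 1 byte per dimension

tmp = tempfile.mkdtemp()
codes_path = os.path.join(tmp, "compressed.bin")  # hypothetical file layout
vecs_path = os.path.join(tmp, "full_vecs.bin")
codes.tofile(codes_path)
data.tofile(vecs_path)

# Memory-map both files: the OS pages in only the bytes actually touched,
# so resident memory no longer scales with the number of vectors.
codes_mm = np.memmap(codes_path, dtype=np.int8, mode="r", shape=(N, D))
vecs_mm = np.memmap(vecs_path, dtype=np.float32, mode="r", shape=(N, D))

def search(query, k=5, shortlist=50):
    # Approximate pass over compressed codes read from "storage"
    # (a linear scan here; DiskANN/AiSAQ walk a graph instead).
    approx = ((codes_mm.astype(np.float32) * scale - query) ** 2).sum(axis=1)
    cand = np.argsort(approx)[:shortlist]
    # Re-rank the shortlist with full-precision vectors, also read from
    # storage, mirroring the re-ranking step the paper retains from DiskANN.
    exact = ((vecs_mm[cand] - query) ** 2).sum(axis=1)
    return cand[np.argsort(exact)[:k]]

q = rng.standard_normal(D).astype(np.float32)
top = search(q)
```

Because the only per-dataset state held in RAM is the pair of file handles, the footprint of this scheme is essentially constant in N, which is the property the paper's ~10 MB figure reflects at billion scale.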
Use Cases
- Large-scale image, music, or document retrieval where dataset size reaches billions of vectors.
- RAG systems that:
  - Need to query multiple knowledge bases / indices.
  - Require fast index switching without reloading large indices into DRAM.
- Cost-sensitive deployments where minimizing DRAM footprint is essential while still needing high-quality ANN search.
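The multi-knowledge-base use case admits a similar sketch: if each collection's index lives in its own file, "switching" amounts to re-pointing a memory map rather than bulk-loading gigabytes into DRAM. The flat vector files and brute-force `nearest` below are stand-ins for AiSAQ's on-disk graph indices; all names and the `IndexSet` class are hypothetical.

```python
import numpy as np
import os, tempfile

rng = np.random.default_rng(2)
D = 8
tmp = tempfile.mkdtemp()

# Hypothetical layout: one file of float32 vectors per knowledge base.
names = ["wiki", "docs", "tickets"]
sizes = {}
for i, name in enumerate(names):
    vecs = rng.standard_normal((1000 * (i + 1), D)).astype(np.float32)
    vecs.tofile(os.path.join(tmp, f"{name}.bin"))
    sizes[name] = len(vecs)

class IndexSet:
    """Switch between indices by re-pointing a memmap: no bulk load into
    RAM, which is the property fast index switching relies on."""
    def __init__(self, root, sizes):
        self.root, self.sizes, self.active = root, sizes, None

    def switch(self, name):
        path = os.path.join(self.root, f"{name}.bin")
        self.active = np.memmap(path, dtype=np.float32, mode="r",
                                shape=(self.sizes[name], D))

    def nearest(self, query):
        # Brute-force scan of the active collection (a stand-in for the
        # on-SSD graph search in the real system).
        return int(np.argmin(((self.active - query) ** 2).sum(axis=1)))

idx = IndexSet(tmp, sizes)
q = rng.standard_normal(D).astype(np.float32)
for name in names:            # switching is just opening another mapping
    idx.switch(name)
    idx.nearest(q)
```

A DRAM-resident design would instead pay a full index load on every switch, which is what makes serving many billion-scale collections from one machine impractical there.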
Pricing
- This is a research method / open implementation, not a commercial product.
- No pricing or commercial plans are specified in the paper content.
Similar Products
- Reconfigurable Inverted Index (Rii): a research project and open-source library for approximate nearest neighbor and similarity search over high-dimensional vectors, built around flexible, reconfigurable inverted index structures.
- Jingfan Meng's research thesis on efficient locality-sensitive hashing (LSH): covers algorithmic solutions, core primitives, and applications for approximate nearest neighbor search; LSH-based indexing is a foundational technique for scalable similarity search over high-dimensional vectors.
- GTS: a GPU-based tree index for fast similarity search over high-dimensional vector data, providing an efficient ANN index structure for high-performance vector database systems.
- Cagra: highly parallel graph construction and approximate nearest neighbor search for GPUs, supporting large-scale vector database operations and efficient similarity search.
- Graph-based vector indexing: a category of vector database solutions and algorithms leveraging graph-based approaches for efficient similarity search, core to many vector database implementations.
- A Ruby gem for approximate nearest neighbor search that can integrate with pgvector and other backends to power vector similarity search in Ruby applications.