Locality-Sensitive Hashing (LSH) is an algorithmic technique for approximate nearest neighbor search in high-dimensional vector spaces, commonly used in vector databases to speed up similarity search while reducing memory footprint.
Spectral Hashing is a method for approximate nearest neighbor search that uses spectral graph theory to generate compact binary codes, often applied in vector databases to enhance retrieval efficiency on large-scale, high-dimensional data.
NMSLIB is an efficient similarity search library and toolkit for high-dimensional vector spaces, supporting a variety of indexing algorithms for vector database use cases.
K-means Tree is a clustering-based data structure that organizes high-dimensional vectors for fast similarity search and retrieval. It is used as an indexing method in some vector databases to optimize performance for vector search operations.
Optimized Product Quantization (OPQ) enhances Product Quantization by optimizing space decomposition and codebooks, leading to lower quantization distortion and higher accuracy in vector search. OPQ is widely used in advanced vector databases for improving recall and search quality.
An open-source library for approximate nearest neighbor search in high-dimensional spaces, often used as a backend for vector databases and search engines.
FAISS (Facebook AI Similarity Search) is a popular open-source library for efficient similarity search and clustering of dense vectors. Developed by Facebook/Meta, it supports billions of vectors and is widely used to power vector search engines and databases, especially where raw speed and scalability are needed.
Category: Concepts & Definitions
Tags: ann, similarity-search, high-dimensional, optimization
Source: Wikipedia - Locality-sensitive hashing
Locality-Sensitive Hashing (LSH) is an algorithmic technique in computer science designed for approximate nearest neighbor search in high-dimensional spaces. It is a form of fuzzy hashing that places similar input items into the same "buckets" with high probability, which supports efficient similarity search and data clustering. Unlike traditional hash functions, which aim to minimize collisions, LSH deliberately maximizes collisions between similar items. It can also be viewed as a form of dimensionality reduction: high-dimensional items are mapped to short signatures that approximately preserve the relative distances between them.
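The bucketing idea can be illustrated with a minimal sketch of one well-known LSH family, random-hyperplane hashing for cosine similarity. This is an illustration under stated assumptions, not the scheme of any particular vector database: the function names (make_hyperplanes, signature, build_index, query), the 8-bit signature length, and the sample vectors are all hypothetical choices made for the example.

```python
import random


def make_hyperplanes(num_bits, dim, seed=0):
    """Draw num_bits random hyperplane normals with Gaussian components."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def signature(vec, planes):
    """One bit per hyperplane: 1 if vec lies on its non-negative side, else 0."""
    return tuple(
        1 if sum(p * v for p, v in zip(plane, vec)) >= 0 else 0
        for plane in planes
    )


def build_index(vectors, planes):
    """Group vector ids by signature; ids whose signatures collide share a bucket."""
    buckets = {}
    for idx, vec in enumerate(vectors):
        buckets.setdefault(signature(vec, planes), []).append(idx)
    return buckets


def query(vec, planes, buckets):
    """Return candidate ids from the query's own bucket (possibly empty)."""
    return buckets.get(signature(vec, planes), [])


# Hypothetical sample data: three 3-dimensional vectors.
vectors = [[1.0, 2.0, 3.0], [-3.0, 0.5, -1.0], [0.2, -4.0, 1.5]]
planes = make_hyperplanes(num_bits=8, dim=3)
buckets = build_index(vectors, planes)

# The query is vector 0 scaled by 2, so it points in the same direction
# and therefore falls on the same side of every hyperplane.
candidates = query([2.0, 4.0, 6.0], planes, buckets)
print(candidates)
```

Because each bit depends only on the sign of a dot product, vectors that point in similar directions tend to agree on most bits and collide, while dissimilar vectors usually land in different buckets. A single table like this one can miss true neighbors that differ in one bit, so practical setups typically build several tables with independent hyperplanes and then re-rank the merged candidates by exact distance.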
Not applicable (concept/algorithm, not a commercial product).