
Locality Sensitive Hashing (LSH)
Algorithmic technique for approximate nearest neighbor search in high-dimensional spaces using hash functions to map similar items to the same buckets with high probability.
About this tool
Overview
Locality-Sensitive Hashing (LSH) is an algorithmic technique that hashes similar input items into the same "buckets" with high probability (the number of buckets being much smaller than the universe of possible input items).
How LSH Works
LSH uses hash functions that map similar items to the same hash bucket with high probability, unlike traditional hash functions that try to avoid collisions. This property makes it particularly useful for approximate nearest neighbor search in high-dimensional spaces.
Implementation Libraries
Python Libraries
- FALCONN - Based on newer NIPS'15 paper ideas (2015-2016)
- E2LSH - Older version of LSH available as alpha-version
- LSHash by Kay Zhu - Fast hash calculation for large amounts of high dimensional data through numpy arrays
- SparseLSH - Emphasis on large, highly-dimensional datasets with sparse matrices
- mattilyra/LSH - Uses MurmurHash v3 library with Cython for performance
Java Library
java-LSH - Implements LSH as described in Leskovec, Rajaraman & Ullman (2014), "Mining of Massive Datasets"
C++ Library
LSHKIT - A C++ Locality Sensitive Hashing Library
Production Alternatives
For production use, libraries built for similarity search like FAISS or managed solutions like Pinecone are recommended instead of basic LSH implementations.
Use Cases
- Approximate nearest neighbor search
- Duplicate detection
- Image similarity search
- Document deduplication
- Recommendation systems
Pricing
Various open-source implementations available.
Loading more......
Information
Categories
Tags
Similar Products
6 result(s)