usearch
usearch is a fast, open-source search and clustering engine designed for efficient vector search across multiple programming languages.
About this tool
usearch
Description
usearch is a fast, open-source search and clustering engine for similarity search on vectors (and strings in the future), supporting a wide range of programming languages. It is designed to be efficient, compact, and flexible, enabling both approximate and exact search, clustering, and various customizations.
Features
- Similarity Search & Clustering: Supports both approximate (HNSW) and exact (brute-force) similarity search for high-dimensional vectors.
- Multi-language Support: Native bindings for C++, C, Python, JavaScript, Rust, Java, Objective-C, Swift, C#, GoLang, and Wolfram.
- Cross-platform: Runs on Linux, MacOS, Windows, iOS, Android, WebAssembly, and supports SQLite3 integration.
- User-defined Metrics: Allows custom similarity metrics beyond standard inner product and Euclidean distance. Supports custom metrics with Numba, Cppyy, or PeachPy in Python, and similar support in C, C++, and Rust.
- Efficient Indexing: Compact codebase, minimal dependencies, and optimized for speed and low memory usage; can handle billions to trillions of entries with efficient storage (32-bit, 40-bit, and 64-bit IDs).
- Flexible Filtering: Supports predicate functions for filtering search results during traversal, including integration with external containers (e.g., Bloom filters, third-party databases).
- Quantization & Memory Efficiency: Supports automatic up-casting/down-casting between numeric types (f64, f32, f16, i8, b1x8), quantization for RAM savings, and special handling for binary and cosine-like metrics.
- Serialization & Disk Serving: Indexes can be saved, loaded, and served from disk, including external memory for cost-efficient deployment.
- Batch Operations: Supported in some bindings (notably Python and JavaScript), allowing efficient bulk add/search operations.
- Joins & Mappings: Enables approximate, fuzzy, and semantic joins for database and data management use cases.
- Clustering: Fast K-Nearest Neighbors clustering and dimensionality reduction (e.g., PCA), often outperforming traditional clustering libraries.
- Multi-index Lookups: Supports building and querying multiple indexes in parallel for extremely large-scale workloads.
- Variable-length Vectors: Supported in C++.
- Large-scale Capacity: Custom 40-bit type for efficient addressing of 4B+ entries.
- Concurrent Design: Index structure is concurrent, enabling multi-threaded operations.
- Application Integrations: Examples include semantic search (with UForm, UCall), molecular similarity (with RDKit), geospatial search using Haversine distance, and more.
- Lightweight Deployments: Minimal dependencies, small binding sizes (e.g., Python binding < 1 MB), and native bindings for low latency.
Supported Languages and Features Table
| Feature | C++ | Python | C | Java | JavaScript | Rust | GoLang | Swift | |------------------------|-----|--------|-----|------|------------|------|--------|-------| | Add, search, remove | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | | Save, load, view | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | | User-defined metrics | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | | Batch operations | ❌ | ✅ | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | | Filter predicates | ✅ | ❌ | ✅ | ❌ | ❌ | ✅ | ❌ | ❌ | | Joins | ✅ | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | | Variable-length vectors| ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | | 4B+ capacities | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ |
Application Examples
- Semantic Search: Text-to-image search platforms, multimodal semantic search using encoder models.
- Molecular Search: Searching molecular structures using binary fingerprints (e.g., Tanimoto coefficient).
- Geospatial Search: GIS applications with Haversine and custom distance functions.
Licensing
- Open-source
Pricing
- No pricing information provided; open-source project.
Loading more......
Information
Categories
Similar Products
6 result(s)Voyager is a Spotify open-source vector search library and service for efficient nearest neighbor search on large-scale vector datasets.
An open-source library for approximate nearest neighbor search in high-dimensional spaces, often used as a backend for vector databases and search engines.
Deep Searcher is a local open-source deep research solution that integrates Milvus and LangChain to provide advanced vector search and retrieval capabilities using open-source models.
HNSWLIB is a C++ library with Python bindings for fast approximate nearest neighbor search using Hierarchical Navigable Small World (HNSW) graphs, commonly used in vector database backends.
PostgreSQL supports vector indexing and similarity search via the PGVector extension, allowing relational databases to manage and retrieve vector embeddings efficiently.
RediSearch is a Redis module that provides high-performance vector search and similarity search capabilities on top of Redis, enabling advanced search and retrieval features for AI and data applications.