vsag
vsag is an Alibaba open-source library implementing efficient vector search algorithms, including approximate nearest neighbor search for high-dimensional vectors.
About this tool
Brand: Alibaba
Category: SDKs & Libraries
Type: Open‑source vector indexing library
Overview
vsag is an open-source C++ vector indexing library for high-dimensional similarity search, designed to handle vector datasets that may not fit entirely in memory. It implements efficient vector search algorithms, including approximate nearest neighbor (ANN) search, and provides a Python wrapper package, pyvsag, for easier integration in Python environments.
Features
- Vector similarity search
  - Supports high-dimensional similarity search and approximate nearest neighbor (ANN) queries.
  - Designed to work with vector sets of various sizes, including those that do not fit into memory.
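To make the exact-versus-approximate distinction concrete, here is a minimal sketch in plain Python. It does not use vsag or pyvsag; the helper names `l2_distance`, `exact_knn`, and `recall_at_k` are illustrative assumptions, not part of any library. It computes a brute-force (exact) top-k result and the recall of a candidate result against it, which is the usual way an ANN index's accuracy is judged.

```python
import heapq
import math
import random


def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def exact_knn(vectors, query, k):
    """Return the positions of the k vectors closest to query, nearest first."""
    scored = ((l2_distance(v, query), i) for i, v in enumerate(vectors))
    return [i for _, i in heapq.nsmallest(k, scored)]


def recall_at_k(approx_ids, exact_ids):
    """Fraction of the true top-k neighbors present in the approximate result."""
    return len(set(approx_ids) & set(exact_ids)) / len(exact_ids)


if __name__ == "__main__":
    rng = random.Random(0)
    dim, n = 16, 1000
    data = [[rng.random() for _ in range(dim)] for _ in range(n)]
    query = [rng.random() for _ in range(dim)]

    truth = exact_knn(data, query, k=10)

    # Stand-in for an ANN result: search only a random half of the data.
    subset = rng.sample(range(n), n // 2)
    local = exact_knn([data[j] for j in subset], query, k=10)
    approx = [subset[i] for i in local]

    print(f"recall@10 = {recall_at_k(approx, truth):.2f}")
```

Brute force scans every vector per query, so its cost grows linearly with the dataset; an ANN index gives up some recall to avoid that scan.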
- Indexing algorithms
  - Implements the VSAG algorithm for efficient vector indexing and search.
  - Includes the SINDI algorithm, which is optimized for sparse vector search.
  - Provides example implementations such as HNSW-based indexing (e.g., 101_index_hnsw.cpp, example_hnsw.py).
- Performance characteristics
  - The SINDI algorithm targets state-of-the-art sparse vector search; the project reports significantly higher QPS (queries per second) than prior solutions in its internal tests.
  - Demonstrated efficiency on:
    - Sparse-full-inner-product tasks, with substantial QPS gains over previous SOTA and baseline algorithms on benchmark datasets.
    - gist-960-euclidean benchmarks, tested in the ann-benchmarks framework.
  - Benchmarks are reported on multi-core CPU environments (e.g., Intel Xeon CPUs, AWS r6i.16xlarge), making the results relevant to large-scale, CPU-based deployments.
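As a rough illustration of what a sparse inner-product search task and a QPS figure involve, the sketch below scores sparse vectors through a simple inverted index and times a batch of queries. This is a generic textbook technique, not the SINDI algorithm and not vsag code; `build_inverted_index`, `top_k_inner_product`, and `measure_qps` are hypothetical names, and the scoring assumes non-negative weights.

```python
from collections import defaultdict
import heapq
import time


def build_inverted_index(docs):
    """Map each dimension to the (doc_id, value) pairs that are nonzero there.

    docs is a list of sparse vectors stored as {dimension: value} dicts.
    """
    index = defaultdict(list)
    for doc_id, vec in enumerate(docs):
        for dim, value in vec.items():
            index[dim].append((doc_id, value))
    return index


def top_k_inner_product(index, query, k):
    """Return up to k (doc_id, score) pairs with the largest inner product.

    Only documents sharing at least one nonzero dimension with the query
    are scored, which is sufficient when all weights are non-negative.
    """
    scores = defaultdict(float)
    for dim, q_value in query.items():
        for doc_id, d_value in index.get(dim, ()):
            scores[doc_id] += q_value * d_value
    return heapq.nlargest(k, scores.items(), key=lambda item: item[1])


def measure_qps(search_fn, queries):
    """Queries per second: number of queries divided by wall-clock time."""
    start = time.perf_counter()
    for q in queries:
        search_fn(q)
    elapsed = time.perf_counter() - start
    return len(queries) / elapsed


if __name__ == "__main__":
    docs = [{0: 1.0, 3: 2.0}, {1: 4.0}, {0: 0.5, 1: 1.0, 3: 1.0}]
    index = build_inverted_index(docs)
    queries = [{0: 2.0, 3: 1.0}] * 1000
    qps = measure_qps(lambda q: top_k_inner_product(index, q, 2), queries)
    print(f"{qps:.0f} queries per second")
```

QPS numbers are only comparable at equal recall and on the same hardware, which is why benchmark frameworks report both together.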
- Automatic parameter generation
  - Provides methods to generate algorithm parameters based on vector dimensions and data scale.
  - Aims to let developers use the library effectively without needing deep knowledge of the underlying algorithms.
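The idea of deriving parameters from data shape can be sketched as follows. This is a toy heuristic written for illustration only: it is not vsag's parameter generator, and the function name `suggest_hnsw_params`, the thresholds, and the memory formula are all assumptions.

```python
import json


def suggest_hnsw_params(dim, num_vectors):
    """Toy heuristic (NOT vsag's logic): pick HNSW-style settings from data shape."""
    # Assumption: larger collections get denser graphs, clamped to [16, 64].
    digits = len(str(max(num_vectors, 1)))
    max_degree = min(64, max(16, 8 * digits))

    # Assumption: the build-time candidate list scales with graph degree.
    ef_construction = 8 * max_degree

    # Rough footprint: float32 vectors plus 4-byte neighbor ids per vector.
    estimated_bytes = num_vectors * (dim * 4 + max_degree * 4)

    return {
        "dim": dim,
        "max_degree": max_degree,
        "ef_construction": ef_construction,
        "estimated_bytes": estimated_bytes,
    }


if __name__ == "__main__":
    print(json.dumps(suggest_hnsw_params(dim=128, num_vectors=1_000_000), indent=2))
```

A caller supplies only the dimension and the expected number of vectors; the algorithm-specific knobs are filled in on their behalf, which is the convenience the feature aims for.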
- Language support
  - C++ core implementation.
  - Python wrapper package: pyvsag, available on PyPI.
- Integration & build
  - CMake integration via FetchContent for easy inclusion in C++ projects.
  - Public C++ headers available in the include directory.
  - Example CMake configuration for fetching and linking vsag in user projects.
  - Build-from-source instructions provided in a DEVELOPMENT.md guide in the repository.
- Examples and usage
  - C++ and Python example programs provided in the examples directory.
  - Starter examples include:
    - 101_index_hnsw.cpp (C++)
    - example_hnsw.py (Python)
- Ecosystem adoption
  - Used by multiple database and data systems, including (as listed in the repo):
    - OceanBase
    - TuGraph
    - GreptimeDB
    - Hologres
    - PolarDB
Tech Stack
- Core language: C++ (C++11 standard or later)
- Build system: CMake (3.11+ recommended)
- Python wrapper: pyvsag (via PyPI)
Pricing
vsag is an open-source library. No pricing plans are listed; usage is free under the terms of its open-source license (see the GitHub repository for license details).
Links
- Source code & documentation: https://github.com/alipay/vsag
- Python package (pyvsag): https://pypi.org/project/pyvsag/
- Benchmarks information: https://ann-benchmarks.com/ (external benchmark framework referenced by the project)
Similar Products
An open-source library for approximate nearest neighbor search in high-dimensional spaces, often used as a backend for vector databases and search engines.
MRPT (Multi-Resolution Proximity Trees) is an open-source library for fast approximate nearest neighbor search in high-dimensional vector spaces, applicable to vector database backends.
An influential paper analyzing and improving approximate nearest neighbor search methods for high-dimensional data, highly relevant for developing and understanding vector databases.
A Go implementation of the HNSW approximate nearest neighbor search algorithm, enabling developers to embed efficient vector similarity search directly into Go services and custom vector database solutions.
A Rust implementation of the HNSW (Hierarchical Navigable Small World) approximate nearest neighbor search algorithm, useful for building high-performance, memory-safe vector search components in Rust-based AI and retrieval systems.
jvector is a high-performance Java-based library and engine for vector search and approximate nearest neighbor indexing.