SymphonyQG
SymphonyQG is a research codebase and method that integrates vector quantization with graph-based indexing to build efficient approximate nearest neighbor (ANN) indexes for high-dimensional vector search. It targets vector database and similarity search scenarios where combining compact codes with navigable graphs can improve recall–latency tradeoffs and memory footprint.
About this tool
Category: SDKs & Libraries
Brand: gouyt13
Source: https://github.com/gouyt13/SymphonyQG
Features
Core Capabilities
- Approximate Nearest Neighbor (ANN) search for high-dimensional vectors.
- Hybrid quantization + graph approach ("quantized graph") to balance recall, latency, and memory footprint.
- Configurable distance metric, chosen when the index is created.
- Bounded graph degree via the degree_bound parameter to control graph connectivity and index size.
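To make the effect of degree_bound on index size concrete, here is a rough back-of-the-envelope sketch. It assumes each node stores at most degree_bound neighbor IDs of 4 bytes each. The ID width and the helper name adjacency_bytes are assumptions for illustration, not taken from the SymphonyQG code, and the estimate ignores the quantized codes and raw vectors that an index also has to store.

```python
def adjacency_bytes(num_elements, degree_bound=32, id_bytes=4):
    """Upper bound on neighbor-list storage for a degree-bounded graph.

    Assumption: every node keeps at most `degree_bound` neighbor IDs,
    each `id_bytes` wide. Quantized codes and raw vectors are not counted.
    """
    return num_elements * degree_bound * id_bytes


# One million vectors at the default degree_bound of 32
print(adjacency_bytes(1_000_000) / 1e6, "MB")  # 128.0 MB
```

Doubling degree_bound doubles this bound, which is why the parameter acts as a direct lever on index size.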
Library Structure
- data/: datasets and pre-built indices used for experiments and reproduction.
- symqglib/
  - index/fastscan/: helper functions for FastScan-style operations.
  - index/qg/: implementation of the quantized graph index.
  - third/: third-party dependencies needed by the library.
  - utils/: common utility functions.
- python/: Python bindings exposing the core C++ ANN index to Python.
- reproduce/: scripts and code to reproduce experimental results, including configuration and dataset handling.
- test/: tests for validating core functionality (implied by directory name).
Python API (Recommended Interface)
Index Construction
symphonyqg.Index(index_type, metric, num_elements, dimension, degree_bound=32)
- Parameters
  - index_type: type of ANN index to build (e.g., specific quantization/graph variant; exact values in repo docs/code).
  - metric: distance/similarity metric (e.g., L2, inner product; details in repo).
  - num_elements: number of vectors the index will hold.
  - dimension: dimensionality of each vector.
  - degree_bound (optional, default = 32): upper bound on graph node degree to control index sparsity and memory.
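For reference, the two metrics named above as examples can be written out in plain Python. This only illustrates what each metric computes. The function names are hypothetical, and the listing does not say which strings the metric parameter accepts, so consult the repo for those.

```python
def squared_l2(a, b):
    """Squared Euclidean distance: smaller values mean closer vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def inner_product(a, b):
    """Inner product: larger values mean more similar vectors."""
    return sum(x * y for x, y in zip(a, b))


a, b = [1.0, 2.0], [4.0, 6.0]
print(squared_l2(a, b))     # 25.0
print(inner_product(a, b))  # 16.0
```

The two metrics rank in opposite directions (minimize L2, maximize inner product), so an index has to know which one it was built for.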
Methods
- build_index(data, EF, num_iter=3, num_threads=ALL_THREADS): builds the ANN index from training data.
  - data: input dataset, shape (num_elements, dimension), dtype=float32.
  - EF: construction/search breadth parameter (controls exploration during graph-based search/build).
  - num_iter: number of construction iterations (default 3).
  - num_threads: number of threads to use (default ALL_THREADS).
- save(filename): saves the built index to disk.
- load(filename): loads a previously saved index from disk.
- set_ef(EF): adjusts the runtime search parameter EF to trade off latency vs. recall without rebuilding the index.
- search(query, k): performs ANN search.
  - query: query vector(s), shape (dimension,) or (1, dimension), dtype=float32.
  - k: number of nearest neighbors to retrieve.
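To give some intuition for what a breadth parameter such as EF controls, the following is a minimal sketch of generic best-first search over a neighbor graph, the pattern commonly used by graph-based ANN indexes. It is not SymphonyQG's implementation (which works on quantized codes in C++). The function names, the toy chain graph, and the use of exact squared L2 distances are all assumptions made for illustration.

```python
import heapq


def sq_l2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def beam_search(vectors, neighbors, query, entry, ef, k):
    """Best-first graph search that keeps at most `ef` result candidates."""
    d0 = sq_l2(vectors[entry], query)
    visited = {entry}
    frontier = [(d0, entry)]  # min-heap: closest unexpanded node first
    best = [(-d0, entry)]     # max-heap via negation: the ef best nodes so far
    while frontier:
        dist, node = heapq.heappop(frontier)
        if len(best) >= ef and dist > -best[0][0]:
            break  # nearest open candidate is worse than the worst kept result
        for nb in neighbors[node]:
            if nb in visited:
                continue
            visited.add(nb)
            d = sq_l2(vectors[nb], query)
            if len(best) < ef or d < -best[0][0]:
                heapq.heappush(frontier, (d, nb))
                heapq.heappush(best, (-d, nb))
                if len(best) > ef:
                    heapq.heappop(best)  # drop the current worst result
    ranked = sorted((-neg_d, n) for neg_d, n in best)
    return [n for _, n in ranked][:k]


# Toy data: ten points on a line, each linked to its immediate neighbors.
vectors = [(float(i),) for i in range(10)]
neighbors = {i: [j for j in (i - 1, i + 1) if 0 <= j < 10] for i in range(10)}
print(beam_search(vectors, neighbors, (7.2,), entry=0, ef=3, k=2))  # [7, 8]
```

A larger ef keeps more candidates alive, so the search explores more of the graph before stopping: recall tends to rise and latency rises with it. That is the trade-off a runtime knob like set_ef exposes.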
Examples & Reproducibility
- Python examples: basic usage via Python bindings (creation, build, search, save/load) indicated in README.md.
- Real-world datasets: example configurations and scripts under ./reproduce.
- Datasets description: additional details in ./data/README.md.
- Reproduction guide: steps for experiments in ./reproduce/README.md.
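The creation, build, search, and save sequence can be sketched using only the calls listed in the Python API section above. This is a hedged outline, not the README's example. The wrapper function build_search_save is hypothetical, the module is passed in as an argument so the sequence can be tried with a stand-in object when the library is not installed, and the accepted index_type and metric values are not given in this listing, so they are left as parameters to be filled in from the repo.

```python
def build_search_save(symqg, data, query, index_type, metric,
                      ef_build, ef_search, k, path):
    """Run the documented create -> build -> set_ef -> search -> save sequence.

    symqg:  the imported symphonyqg module (or a stand-in with the same calls).
    data:   array of shape (num_elements, dimension), dtype float32.
    query:  array of shape (dimension,) or (1, dimension), dtype float32.
    index_type, metric: values must come from the repo docs (not assumed here).
    """
    num_elements, dimension = data.shape
    index = symqg.Index(index_type, metric, num_elements, dimension,
                        degree_bound=32)
    index.build_index(data, ef_build)  # num_iter and num_threads keep defaults
    index.set_ef(ef_search)            # tune search breadth without rebuilding
    result = index.search(query, k)    # return format is defined by the library
    index.save(path)
    return result
```

With the library installed, this would be called as build_search_save(symphonyqg, data, query, ...) after import symphonyqg. A saved index can later be restored with load(filename) as described in the method list.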
C++ Usage
- C++ examples are provided (listed under “C++ examples”) for users who want to integrate SymphonyQG directly in C++ rather than through the Python bindings.
License
- A LICENSE file is present in the repository; consult it directly for the exact open-source license terms.
Pricing
SymphonyQG is an open-source research codebase hosted on GitHub. No pricing or paid plans are specified in the available content.
Similar Products
iRangeGraph is an ANN indexing approach and accompanying implementation for range-filtering nearest neighbor search. It provides a specialized graph-based index that supports vector similarity search under range constraints, making it directly useful as a component or reference implementation for advanced vector database indexing and retrieval.
LibVQ is an open-source toolkit for optimizing vector quantization and efficient neural retrieval, offering training and indexing components that can serve as the core of high-performance approximate nearest neighbor search and vector database systems.
NSG is an approximate nearest neighbor search algorithm based on a sparse navigable graph structure designed for high-dimensional vector similarity search. The reference implementation provides a graph-based ANN index that can be integrated into custom vector retrieval systems.
Ruby gem for approximate nearest neighbor search that can integrate with pgvector and other backends to power vector similarity search in Ruby applications.
EFANNA is an extremely fast approximate nearest neighbor search algorithm based on kNN graphs and randomized KD-trees. The provided implementation offers a high-performance ANN index suitable as a building block in custom vector search and retrieval infrastructure.
A Go implementation of the HNSW approximate nearest neighbor search algorithm, enabling developers to embed efficient vector similarity search directly into Go services and custom vector database solutions.