A collection of datasets curated by Intel Labs specifically for evaluating and benchmarking vector search algorithms and databases.
BEIR (Benchmarking IR) is a benchmark suite for evaluating information retrieval and vector search systems across multiple tasks and datasets, which makes it useful for comparing vector database performance.
An annual competition focused on similarity search and indexing algorithms, including approximate nearest neighbor methods and high-dimensional vector indexing, providing benchmarks and results relevant to vector database research.
The open-source repository containing the implementation, configuration, and scripts of VectorDBBench, enabling users to run standardized benchmarks across multiple vector database systems locally or in CI.
A massive text embedding benchmark for evaluating the quality of text embedding models, crucial for vector database applications.
ANN-Benchmarks is a benchmarking platform specifically for evaluating the performance of approximate nearest neighbor (ANN) search algorithms, which are foundational to vector database evaluation and comparison.
A 2024 paper introducing CANDY, a benchmark for continuous ANN search with a focus on dynamic data ingestion, crucial for next-generation vector databases.
A collection of datasets curated by Intel Labs for evaluating and benchmarking vector search algorithms and databases.
dpr, openimages, rqa, text, wit).
https://github.com/IntelLabs/VectorSearchDatasets
datasets, vector-search, benchmark, evaluation
Curated Resource Lists