IntelLabs's Vector Search Datasets

A collection of datasets curated by Intel Labs specifically for evaluating and benchmarking vector search algorithms and databases.

About this tool

Features

  • Provides code to generate several datasets for similarity search benchmarking and evaluation.
  • Datasets are based on high-dimensional vectors from recent deep learning models.
  • Includes multiple datasets (see respective folders: dpr, openimages, rqa, text, wit).
  • Each dataset comes with its own README file for details and usage instructions.
  • Useful for researchers and developers working on vector search, similarity search, and related benchmarking tasks.
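To illustrate the kind of evaluation such datasets support, the sketch below computes exact nearest neighbors by brute force and then scores a candidate result list with recall@k. It is a minimal sketch under stated assumptions, not the repository's code: the function names (`exact_knn`, `recall_at_k`, `make_vectors`) are hypothetical, the vectors are small synthetic Gaussian samples rather than any of the repository's datasets, and the "approximate" search is simulated by searching only half of the base set.

```python
import heapq
import math
import random


def exact_knn(base, query, k):
    """Return the indices of the k base vectors closest to query (Euclidean)."""
    scored = ((math.dist(vec, query), idx) for idx, vec in enumerate(base))
    return [idx for _, idx in heapq.nsmallest(k, scored)]


def recall_at_k(ground_truth, candidates, k):
    """Fraction of the true k nearest neighbors found in the candidate top-k."""
    hits = 0
    for true_ids, found_ids in zip(ground_truth, candidates):
        hits += len(set(true_ids[:k]) & set(found_ids[:k]))
    return hits / (k * len(ground_truth))


def make_vectors(count, dim, seed):
    """Hypothetical stand-in data: synthetic Gaussian vectors, not a real dataset."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(count)]


base = make_vectors(200, 16, seed=0)
queries = make_vectors(5, 16, seed=1)

# Ground truth: exact top-10 neighbors for every query.
ground_truth = [exact_knn(base, q, k=10) for q in queries]

# Simulated approximate search: only the first half of the base set is scanned.
# Indices still refer to positions in `base` because the subset is a prefix.
approx = [exact_knn(base[:100], q, k=10) for q in queries]

print("recall@10 of the simulated index:", recall_at_k(ground_truth, approx, 10))
```

With a real dataset, the synthetic vectors would be replaced by the generated base and query vectors, and `approx` by the output of the index under test.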

Notes

  • Project Status: Not under active management. Intel has ceased development, maintenance, and contributions to this project.
  • Users interested in further development or maintenance are encouraged to fork the repository.

Source

https://github.com/IntelLabs/VectorSearchDatasets

Tags

datasets, vector-search, benchmark, evaluation

Category

Curated Resource Lists

Information

Website: github.com
Published: May 13, 2025

Similar Products

BEIR

BEIR (Benchmarking IR) is a benchmark suite for evaluating information retrieval and vector search systems across multiple tasks and datasets. Useful for comparing vector database performance.

SISAP Indexing Challenge

An annual competition focused on similarity search and indexing algorithms, including approximate nearest neighbor methods and high-dimensional vector indexing, providing benchmarks and results relevant to vector database research.

VectorDBBench

The open-source repository for VectorDBBench. It contains the implementation, configuration, and scripts that let users run standardized benchmarks across multiple vector database systems locally or in CI.

MTEB: Massive Text Embedding Benchmark

A massive text embedding benchmark for evaluating the quality of text embedding models, crucial for vector database applications.

ANN-Benchmarks

ANN-Benchmarks is a benchmarking platform specifically for evaluating the performance of approximate nearest neighbor (ANN) search algorithms, which are foundational to vector database evaluation and comparison.

Zeng, Xianzhi, et al. "CANDY: A Benchmark for Continuous Approximate Nearest Neighbor Search with Dynamic Data Ingestion."

A 2024 paper introducing CANDY, a benchmark for continuous ANN search with a focus on dynamic data ingestion, crucial for next-generation vector databases.
