    IntelLabs's Vector Search Datasets

    A collection of datasets curated by Intel Labs specifically for evaluating and benchmarking vector search algorithms and databases.

    Features

    • Provides code to generate several datasets for similarity search benchmarking and evaluation.
    • Datasets are based on high-dimensional vectors from recent deep learning models.
    • Includes multiple datasets (see respective folders: dpr, openimages, rqa, text, wit).
    • Each dataset comes with its own README file for details and usage instructions.
    • Useful for researchers and developers working on vector search, similarity search, and related benchmarking tasks.
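As a rough illustration of the benchmarking these datasets are meant for, the sketch below computes exact nearest neighbors by brute force and scores a candidate result list with recall@k. It is not taken from the repository: the function names, the toy vectors, the Euclidean distance, and the choice of recall@k as the metric are all assumptions made for this example, and the `approx` list is a hypothetical stand-in for whatever an approximate index would return.

```python
import math

def l2(a, b):
    """Euclidean distance between two equal-length vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def exact_knn(base, query, k):
    """Brute-force ground truth: indices of the k base vectors closest to query."""
    order = sorted(range(len(base)), key=lambda i: l2(base[i], query))
    return order[:k]

def recall_at_k(ground_truth, retrieved, k):
    """Fraction of the true top-k neighbors that appear in the retrieved top-k."""
    return len(set(ground_truth[:k]) & set(retrieved[:k])) / k

# Toy data standing in for a real dataset (hypothetical values).
base = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [5.0, 5.0]]
query = [0.1, 0.0]

truth = exact_knn(base, query, k=2)
approx = [0, 2]  # hypothetical output of an approximate index
score = recall_at_k(truth, approx, k=2)
print(truth, score)
```

On real data the brute-force step is usually run once per dataset and its results stored, so that different indexes can be compared against the same ground truth.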

    Notes

    • Project Status: Not under active management. Intel has ceased development, maintenance, and contributions to this project.
    • Users interested in further development or maintenance are encouraged to fork the repository.

    Source

    https://github.com/IntelLabs/VectorSearchDatasets

    Tags

    datasets, vector-search, benchmark, evaluation

    Category

    Curated Resource Lists

    Information

    Website: github.com
    Published: May 13, 2025

    Similar Products

    6 result(s)
    BEIR

    BEIR (Benchmarking IR) is a benchmark suite for evaluating information retrieval and vector search systems across multiple tasks and datasets. Useful for comparing vector database performance.

    MTEB Leaderboard

    Massive Text Embedding Benchmark leaderboard covering 58 datasets across 112 languages and 8 embedding tasks. Industry-standard benchmark for comparing text embedding models.

    ViDoRe Benchmark

    Visual Document Retrieval benchmark designed to evaluate embedding models and retrieval systems on visually rich documents containing tables, charts, diagrams, and complex layouts. The standard benchmark for assessing multi-modal document understanding and retrieval performance.

    MTEB (Massive Text Embedding Benchmark)

    Comprehensive benchmark suite for evaluating embedding models across 58 datasets spanning 112 languages and eight task types, including retrieval, clustering, and semantic similarity. It is the standard for comparing embedding quality.

    MMTEB

    Massive Multilingual Text Embedding Benchmark covering over 500 quality-controlled evaluation tasks across 250+ languages, representing the largest multilingual collection of embedding model evaluation tasks.

    Deep1B Dataset

    Billion-scale benchmark dataset containing 96-dimensional deep learning image embeddings. Provides real-world proxy for testing distributed systems and GPU-accelerated vector search at scale.

    All product names, logos, and brands are the property of their respective owners. All company, product, and service names used in this repository, related repositories, and associated websites are for identification purposes only. The use of these names, logos, and brands does not imply endorsement, affiliation, or sponsorship. This directory may include content generated by artificial intelligence.
    Copyright © 2025 Awesome Vector Databases. All rights reserved.