
    MMTEB

    Massive Multilingual Text Embedding Benchmark covering over 500 quality-controlled evaluation tasks across 250+ languages, representing the largest multilingual collection of embedding model evaluation tasks.


    About this tool

    Overview

    MMTEB (Massive Multilingual Text Embedding Benchmark) is a large-scale, community-driven expansion of MTEB, covering over 500 quality-controlled evaluation tasks across 250+ languages. It represents the largest multilingual collection of evaluation tasks for embedding models to date.

    Key Features

    Diverse Task Set

MMTEB includes a set of challenging, novel task types:

    • Instruction following
    • Long-document retrieval
    • Code retrieval
    • Traditional NLP tasks (classification, clustering, etc.)

    Community-Driven

    Created through a large-scale, open collaboration, with contributors including:

    • Native speakers from diverse linguistic backgrounds
    • NLP practitioners
    • Academic and industry researchers
    • Enthusiasts

    Regional Benchmarks

    From the extensive collection of tasks in MMTEB, several representative benchmarks were developed:

    • MTEB(Multilingual): Highly multilingual benchmark
    • MTEB(Europe): Regional geopolitical benchmark for European languages
    • MTEB(Indic): Regional geopolitical benchmark for Indic languages

    Performance Findings

    While large language models (LLMs) with billions of parameters can achieve state-of-the-art performance on certain language subsets and task categories, the best-performing publicly available model is multilingual-e5-large-instruct with only 560 million parameters.

    Computational Efficiency

MMTEB introduces a novel downsampling method based on inter-task correlation that ensures a diverse task selection while preserving relative model rankings at a fraction of the computational cost.
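MMTEB's actual selection procedure is described in the paper; as a rough illustration only, the following hypothetical sketch (not the MMTEB algorithm) shows the general idea of correlation-based downsampling. Given a matrix of per-task model scores, it seeds with the task most correlated with average model skill, then greedily adds the task least correlated with those already chosen, keeping the subset diverse.

```python
import numpy as np

def downsample_tasks(scores: np.ndarray, k: int) -> list[int]:
    """Greedily select k task columns that are mutually least correlated.

    scores: (n_models, n_tasks) matrix of per-task model scores.
    Returns the column indices of the selected tasks.
    """
    n_tasks = scores.shape[1]
    corr = np.corrcoef(scores, rowvar=False)  # (n_tasks, n_tasks) task correlations
    mean_score = scores.mean(axis=1)          # average skill of each model
    # Seed with the single task most correlated with average model skill.
    seed = int(np.argmax([np.corrcoef(scores[:, j], mean_score)[0, 1]
                          for j in range(n_tasks)]))
    selected = [seed]
    while len(selected) < k:
        remaining = [j for j in range(n_tasks) if j not in selected]
        # Add the task with the lowest max |correlation| to the current subset.
        nxt = min(remaining,
                  key=lambda j: max(abs(corr[j, s]) for s in selected))
        selected.append(nxt)
    return selected

# Toy demo: 8 models scored on 10 tasks, keep a diverse subset of 4.
rng = np.random.default_rng(0)
scores = rng.random((8, 10))
subset = downsample_tasks(scores, 4)
```

Whether a subset chosen this way actually preserves model rankings would, as in the paper, have to be validated by comparing rankings on the subset against rankings on the full task set.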

    Pricing

Free to use. MMTEB is an open benchmark, published in February 2025.


    Information

Website: arxiv.org
Published: Mar 13, 2026

    Categories

    Benchmarks & Evaluation

    Tags

#Benchmark #Multilingual #Evaluation

    Similar Products

    MTEB Leaderboard

    Massive Text Embedding Benchmark leaderboard covering 58 datasets across 112 languages and 8 embedding tasks. Industry-standard benchmark for comparing text embedding models.

    MTEB (Massive Text Embedding Benchmark)

Comprehensive benchmark suite for evaluating embedding models across 58 datasets spanning 112 languages and eight task types, including retrieval, clustering, and semantic similarity; it is the standard for comparing embedding quality.

    SISAP Indexing Challenge

    An annual competition focused on similarity search and indexing algorithms, including approximate nearest neighbor methods and high-dimensional vector indexing, providing benchmarks and results relevant to vector database research.

    BEIR

    BEIR (Benchmarking IR) is a benchmark suite for evaluating information retrieval and vector search systems across multiple tasks and datasets. Useful for comparing vector database performance.

    ANN-Benchmarks

    ANN-Benchmarks is a benchmarking platform specifically for evaluating the performance of approximate nearest neighbor (ANN) search algorithms, which are foundational to vector database evaluation and comparison.

IntelLabs' Vector Search Datasets

    A collection of datasets curated by Intel Labs specifically for evaluating and benchmarking vector search algorithms and databases.
