



MTEB evaluates embedding models on 58 datasets covering 112 languages, reporting retrieval and clustering quality metrics (such as nDCG@10 and Recall@k) that serve as proxies for model selection in vector databases. It spans 8 task types for comprehensive evaluation. MTEB is the de facto standard for choosing embeddings in RAG pipelines; it is text-focused (unlike the multimodal BigVectorBench) and complements ANN-Benchmarks, which benchmarks index structures rather than embedding quality.
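To make the retrieval metrics concrete, here is a minimal sketch of nDCG@k and Recall@k for a single query, using made-up relevance judgments (the ranked list and counts below are illustration data, not from any MTEB dataset):

```python
# Sketch of the two retrieval metrics MTEB commonly reports:
# nDCG@k and Recall@k, computed for one query from a ranked doc list.
# The relevance judgments below are toy illustration data.
import math

def dcg_at_k(relevances, k):
    # Graded relevance discounted by log2 of the (1-indexed) rank position.
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances[:k]))

def ndcg_at_k(relevances, k):
    # Normalize by the DCG of the ideal (best possible) ordering.
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0

def recall_at_k(relevances, k, total_relevant):
    # Fraction of all relevant docs that appear in the top k.
    return sum(1 for r in relevances[:k] if r > 0) / total_relevant

# Relevance of the top-5 retrieved docs for one query
# (the corpus holds 2 relevant docs in total).
ranked = [1, 0, 1, 0, 0]
print(round(ndcg_at_k(ranked, 10), 4))   # 0.9197
print(recall_at_k(ranked, 10, 2))        # 1.0
```

MTEB averages such per-query scores over each dataset; nDCG@10 is the main retrieval metric on the leaderboard.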
MTEB (Massive Text Embedding Benchmark) is a comprehensive benchmark suite for evaluating embedding models across diverse NLP tasks, including retrieval, classification, clustering, reranking, and semantic textual similarity.
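For the semantic similarity tasks, MTEB's main score is the Spearman correlation between cosine similarities of sentence-pair embeddings and human judgments. A self-contained sketch (the toy vectors and gold scores are invented for illustration):

```python
# Sketch of STS scoring: cosine similarity of embedding pairs, compared
# to human similarity judgments via Spearman rank correlation.
# All vectors and gold scores below are toy illustration data.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def spearman(xs, ys):
    # Rank correlation (no tie handling; adequate for toy data).
    def ranks(v):
        order = sorted(range(len(v)), key=lambda i: v[i])
        r = [0] * len(v)
        for rank, i in enumerate(order):
            r[i] = rank
        return r
    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n * n - 1))

# Toy sentence-pair embeddings and human similarity judgments.
pairs = [([1.0, 0.0], [0.9, 0.1]),
         ([1.0, 0.0], [0.0, 1.0]),
         ([0.5, 0.5], [0.6, 0.1])]
gold = [4.8, 0.3, 4.5]
sims = [cosine(a, b) for a, b in pairs]
print(round(spearman(sims, gold), 2))  # 1.0
```

A perfect score of 1.0 means the model ranks pair similarity in the same order as the human annotators, regardless of the absolute score scale.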
MTEB's retrieval track directly incorporates the open-source BEIR benchmark, whose 15 datasets span diverse domains such as biomedical, scientific, and financial text.
MTEB maintains a public leaderboard on Hugging Face that ranks models by their average score across tasks, making it easy to see which embedding models currently perform best.
# Evaluate a sentence-transformers model on a single MTEB task
from mteb import MTEB
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("model-name")  # replace with an actual model ID
evaluation = MTEB(tasks=["Banking77Classification"])
results = evaluation.run(model)
MTEB is a free and open-source benchmark framework.