



Universal text embedding model from NVIDIA achieving state-of-the-art performance on MMTEB leaderboard, optimized for retrieval, reranking, semantic similarity, and classification with 4,096-dimensional embeddings.
llama-embed-nemotron-8b is a versatile text embedding model trained by NVIDIA and optimized for retrieval, reranking, semantic similarity, and classification use cases. It achieves state-of-the-art performance on the Multilingual Massive Text Embedding Benchmark (MMTEB) leaderboard as of October 21, 2025.
The model combines:
- Robust multilingual and cross-lingual text retrieval capabilities, designed to serve as a foundational component in text-based Retrieval-Augmented Generation (RAG) systems.
- A universal, instruction-tuned embedding architecture that generates specialized embeddings for a wide range of tasks, including retrieval, classification, and semantic textual similarity (STS).
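To illustrate how such an embedding model is typically used for retrieval, here is a minimal sketch of cosine-similarity search over 4,096-dimensional vectors. The `mock_embed` function is a stand-in, not the model's actual API: in practice you would replace it with inference against llama-embed-nemotron-8b.

```python
import numpy as np

EMBED_DIM = 4096  # embedding size reported for llama-embed-nemotron-8b

def mock_embed(texts):
    """Stand-in for real model inference: returns deterministic
    pseudo-random unit vectors, one per input text. Replace this
    with actual embedding calls in a real RAG pipeline."""
    vecs = []
    for t in texts:
        rng = np.random.default_rng(abs(hash(t)) % (2**32))
        v = rng.standard_normal(EMBED_DIM)
        vecs.append(v / np.linalg.norm(v))
    return np.stack(vecs)

def top_k(query_vec, doc_vecs, k=1):
    """On unit-normalized vectors, cosine similarity reduces
    to a dot product; return the k best document indices."""
    scores = doc_vecs @ query_vec
    order = np.argsort(scores)[::-1][:k]
    return order, scores[order]

docs = ["RAG systems retrieve supporting passages.",
        "Embeddings map text to dense vectors."]
doc_vecs = mock_embed(docs)
query_vec = mock_embed(["What do embeddings do?"])[0]
best_idx, best_scores = top_k(query_vec, doc_vecs, k=1)
```

In a production RAG system the document vectors would be precomputed and stored in a vector index, with only the query embedded at request time.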
The complete dataset consists of 4.3 million samples drawn from a diverse range of corpora.
In comparative benchmarks, the model achieved 62% Top-1 accuracy, the highest among all tested embedding models.
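Top-1 accuracy in a retrieval benchmark is the fraction of queries whose highest-scoring document is the labeled relevant one. A minimal sketch of the metric, with purely illustrative scores and labels:

```python
import numpy as np

def top1_accuracy(score_matrix, relevant_idx):
    """score_matrix: (num_queries, num_docs) similarity scores.
    relevant_idx: per-query index of the gold document.
    Returns the fraction of queries ranking the gold doc first."""
    predicted = np.argmax(score_matrix, axis=1)
    return float(np.mean(predicted == np.asarray(relevant_idx)))

# Illustrative: 4 queries, 3 docs; 3 of 4 queries rank the gold doc first.
scores = np.array([[0.9, 0.1, 0.0],
                   [0.2, 0.8, 0.1],
                   [0.3, 0.2, 0.7],
                   [0.6, 0.5, 0.4]])
gold = [0, 1, 2, 1]
acc = top1_accuracy(scores, gold)  # 0.75
```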
The model is free to use under the NVIDIA AI Foundation Models license.