
nomic-embed-text-v2-moe
A multilingual Mixture-of-Experts (MoE) text embedding model that excels at multilingual retrieval, achieving state-of-the-art performance among ~300M-parameter models. Supports ~100 languages, offers Matryoshka Embeddings, and was trained on 1.6B pairs.
About this tool
Overview
nomic-embed-text-v2-moe is a multilingual Mixture of Experts (MoE) text embedding model that excels at multilingual retrieval, delivering state-of-the-art performance among models of roughly 300M parameters.
Key Features
Multilingual Support
Supports approximately 100 languages, providing robust multilingual, cross-lingual, and code retrieval capabilities.
Training
Trained on over 1.6 billion pairs, ensuring comprehensive coverage across languages and domains.
Matryoshka Embeddings
Supports flexible embedding dimensions through Matryoshka Embeddings, allowing you to truncate vectors to smaller dimensions without retraining.
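As a rough illustration, Matryoshka truncation amounts to keeping the leading components of the full vector and re-normalizing. This is a minimal sketch; the 768-dimension full size is an assumption based on typical Nomic embed outputs, so check the model card for the dimensions the model was actually trained to support:

```python
import numpy as np

def truncate_matryoshka(embedding: np.ndarray, dim: int) -> np.ndarray:
    """Keep the first `dim` components and re-normalize to unit length."""
    truncated = embedding[:dim]
    return truncated / np.linalg.norm(truncated)

# Stand-in for a model output vector (768 dims assumed; random here for illustration).
full = np.random.default_rng(0).standard_normal(768).astype(np.float32)

emb_256 = truncate_matryoshka(full, 256)  # smaller index, cheaper storage, faster search
emb_768 = full / np.linalg.norm(full)     # full fidelity
```

Smaller dimensions trade a modest amount of retrieval accuracy for lower storage and faster similarity search, with no retraining required.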
Performance
Achieves state-of-the-art multilingual performance among models of roughly 300M parameters, making it highly efficient for its size.
Local Deployment
Available through Ollama (see the sketch after this list), allowing you to:
- Run the model completely offline
- Ensure complete data privacy
- Eliminate internet dependency for processing documents
- Deploy on your own infrastructure
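A minimal sketch of generating embeddings through a local Ollama server. The model tag below is an assumption; check `ollama list` for the exact name on your install, and note that the `/api/embed` endpoint requires a reasonably recent Ollama version:

```python
import requests

# Assumes Ollama is running locally and the model has been pulled,
# e.g. via `ollama pull nomic-embed-text-v2-moe` (tag is an assumption).
resp = requests.post(
    "http://localhost:11434/api/embed",
    json={
        "model": "nomic-embed-text-v2-moe",
        "input": ["The quick brown fox", "Der schnelle braune Fuchs"],
    },
)
resp.raise_for_status()
embeddings = resp.json()["embeddings"]  # one vector per input string
print(len(embeddings), len(embeddings[0]))
```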
Use Cases
- Multilingual semantic search
- Cross-lingual information retrieval (illustrated in the sketch after this list)
- International document processing
- Code search across multiple programming languages
- RAG systems for multilingual content
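A sketch of cross-lingual semantic search built on the Ollama call shown above. Both the model tag and the `search_query:`/`search_document:` task prefixes are assumptions drawn from Nomic's usual conventions; confirm both against the model card:

```python
import numpy as np
import requests

def embed(text: str) -> np.ndarray:
    # Assumed local Ollama endpoint and model tag; adjust to your setup.
    resp = requests.post(
        "http://localhost:11434/api/embed",
        json={"model": "nomic-embed-text-v2-moe", "input": text},
    )
    resp.raise_for_status()
    return np.array(resp.json()["embeddings"][0])

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Task prefixes follow Nomic's documented retrieval convention
# (an assumption here; see the model card).
docs = ["El gato duerme en el sofá",    # Spanish: the cat sleeps on the sofa
        "Die Börse fiel heute stark",   # German: the market fell sharply today
        "The weather is sunny in Paris"]
doc_vecs = [embed("search_document: " + d) for d in docs]
query_vec = embed("search_query: Where is the cat sleeping?")

# A cross-lingual model should rank the Spanish sentence first.
for doc, vec in sorted(zip(docs, doc_vecs), key=lambda p: -cosine(query_vec, p[1])):
    print(f"{cosine(query_vec, vec):.3f}  {doc}")
```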
Integration
- Available through Ollama for easy local deployment
- Compatible with various embedding frameworks
- Can be used with Chroma, Milvus, and other vector databases (see the Chroma sketch below)
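As one example, embeddings from the model can be stored and queried in Chroma. This sketch reuses the assumed Ollama endpoint and model tag from above; the collection name is arbitrary:

```python
import chromadb
import requests

def embed_batch(texts: list[str]) -> list[list[float]]:
    # Assumed local Ollama endpoint and model tag, as in the earlier sketches.
    resp = requests.post(
        "http://localhost:11434/api/embed",
        json={"model": "nomic-embed-text-v2-moe", "input": texts},
    )
    resp.raise_for_status()
    return resp.json()["embeddings"]

client = chromadb.Client()  # in-memory; chromadb.PersistentClient(path=...) persists to disk
collection = client.create_collection(name="multilingual_docs")

docs = ["Bonjour tout le monde", "Hello everyone", "こんにちは、皆さん"]
collection.add(
    ids=[f"doc-{i}" for i in range(len(docs))],
    documents=docs,
    embeddings=embed_batch(["search_document: " + d for d in docs]),
)

results = collection.query(
    query_embeddings=embed_batch(["search_query: a friendly greeting"]),
    n_results=2,
)
print(results["documents"])
```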
Pricing
Free and open-source, runs locally through Ollama.