Multimodal embedding model with 137M parameters that outperforms OpenAI's text-embedding-3-small on both short- and long-context tasks. Supports Matryoshka Representation Learning for flexible embedding dimensions.
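Matryoshka Representation Learning trains a model so that the leading dimensions of each embedding carry the most information, which lets you truncate vectors to a smaller dimension and re-normalize without re-encoding. A minimal sketch of that truncation step, using a toy vector in place of real model output (the helper name is illustrative, not part of any model's API):

```python
import numpy as np

def truncate_embedding(embedding: np.ndarray, dim: int) -> np.ndarray:
    """Keep the first `dim` components and re-normalize to unit length."""
    truncated = embedding[:dim]
    norm = np.linalg.norm(truncated)
    return truncated / norm if norm > 0 else truncated

# Toy 8-dimensional "embedding" standing in for a real model output.
full = np.array([0.9, 0.3, 0.2, 0.1, 0.05, 0.04, 0.02, 0.01])
full = full / np.linalg.norm(full)

# Truncate to 4 dimensions for cheaper storage and faster search.
short = truncate_embedding(full, 4)
print(short.shape)  # (4,)
```

With a Matryoshka-trained model, the truncated vector remains usable for similarity search at a modest accuracy cost; with an ordinary model, truncation would discard information arbitrarily.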
BGE-VL
State-of-the-art multimodal embedding model from BAAI supporting text-to-image, image-to-text, and compositional visual search. Trained on the MegaPairs dataset with over 26 million retrieval triplets.
Qwen3 Embedding
Multilingual embedding model supporting over 100 languages and ranking first on the MTEB multilingual leaderboard. Offers model sizes from 0.6B to 8B parameters and accepts user-defined task instructions.
Jina Embeddings v4
Universal multimodal embedding model from Jina AI that handles text and images through a unified pathway. Built on Qwen2.5-VL-3B-Instruct, it outperforms proprietary models on visually rich document retrieval. Offered as a commercial API with a free tier; open-source weights are also available.
BGE-M3
A versatile multilingual text embedding model from BAAI that supports 100+ languages and can handle inputs up to 8192 tokens. BGE-M3 is unique in supporting three retrieval methods simultaneously: dense retrieval, multi-vector retrieval, and sparse retrieval.
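In practice, BGE-M3's dense and sparse outputs are often blended into a single relevance score per document. The sketch below shows one common weighting scheme with toy vectors and token weights; the helper names and the 0.7/0.3 split are illustrative assumptions, not part of the model's API:

```python
import numpy as np

def dense_score(q: np.ndarray, d: np.ndarray) -> float:
    """Cosine similarity between dense query and document embeddings."""
    return float(np.dot(q, d) / (np.linalg.norm(q) * np.linalg.norm(d)))

def sparse_score(q_weights: dict, d_weights: dict) -> float:
    """Dot product of learned lexical weights over shared tokens."""
    return sum(w * d_weights[t] for t, w in q_weights.items() if t in d_weights)

def hybrid_score(q_dense, d_dense, q_sparse, d_sparse, alpha: float = 0.7) -> float:
    """Weighted blend of dense and sparse relevance (alpha chosen for illustration)."""
    return alpha * dense_score(q_dense, d_dense) + (1 - alpha) * sparse_score(q_sparse, d_sparse)

# Toy query/document representations standing in for real model outputs.
q_dense = np.array([0.2, 0.8, 0.1])
d_dense = np.array([0.25, 0.75, 0.05])
q_sparse = {"neural": 1.2, "retrieval": 0.9}
d_sparse = {"retrieval": 1.1, "index": 0.4}

print(round(hybrid_score(q_dense, d_dense, q_sparse, d_sparse), 3))
```

The dense term captures semantic similarity while the sparse term rewards exact lexical matches, which is why combining the two often beats either alone on retrieval benchmarks.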
gte-Qwen2-1.5B-instruct
A state-of-the-art multilingual text embedding model from Alibaba's GTE (General Text Embedding) series, built on the Qwen2-1.5B LLM. The model supports up to 8192 tokens and incorporates bidirectional attention mechanisms for enhanced contextual understanding across diverse domains.
INSTRUCTOR
A task-specific text embedding model that generates customized embeddings based on natural language instructions. INSTRUCTOR achieves state-of-the-art performance on 70 diverse embedding tasks by allowing users to specify the task objective and domain.