

Generalized long-context text representation and reranking models from Alibaba, supporting 75 languages and context lengths of up to 8192 tokens. Built on a transformer++ encoder with RoPE and GLU for enhanced multilingual retrieval.
Alibaba's Tongyi Lab has introduced the GTE-Multilingual (mGTE) series, which offers high performance, long-context handling, multilingual support, and elastic embeddings, significantly improving retrieval and ranking efficiency.
The mGTE series includes a new generalized text encoder plus embedding and reranking models that support 75 languages and context lengths of up to 8192 tokens.
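Elastic embeddings let one model serve multiple embedding sizes. A minimal sketch of the idea, assuming a Matryoshka-style scheme in which a leading prefix of the embedding's dimensions remains usable on its own (the function name and the 768-dimension example are illustrative, not part of the mGTE API):

```python
import numpy as np

def elastic_truncate(embedding: np.ndarray, dim: int) -> np.ndarray:
    """Shrink an embedding to its first `dim` dimensions and renormalize.

    Assumption: the model was trained Matryoshka-style, so leading
    dimensions carry a usable coarse representation. Truncating trades a
    little accuracy for smaller vector-index storage and faster search.
    """
    prefix = embedding[:dim]
    return prefix / np.linalg.norm(prefix)

# Illustrative only: a full-size 768-d vector reduced to 256 dimensions.
full = np.random.default_rng(0).normal(size=768)
small = elastic_truncate(full, 256)
```

Because the truncated vector is renormalized, cosine similarity can be computed on the smaller vectors exactly as on the full-size ones.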
The models are built upon the transformer++ encoder backbone (BERT with RoPE and GLU) and the vocabulary of XLM-R.
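RoPE is what enables the long 8192-token context: it encodes position by rotating pairs of embedding dimensions, so attention scores depend only on relative distance between tokens. A minimal numpy sketch of the standard RoPE formulation (illustrative; the mGTE encoder's exact implementation may differ in pairing and base):

```python
import numpy as np

def rope(x: np.ndarray, base: float = 10000.0) -> np.ndarray:
    """Apply rotary position embeddings to x of shape (seq_len, dim).

    Each dimension pair (x1[i], x2[i]) is rotated by a position-dependent
    angle; rotations preserve vector norms, and dot products between two
    rotated vectors depend only on their relative position.
    """
    seq_len, dim = x.shape
    half = dim // 2
    # Per-pair rotation frequencies, geometrically spaced as in RoPE.
    freqs = base ** (-np.arange(half) / half)        # (half,)
    angles = np.outer(np.arange(seq_len), freqs)     # (seq_len, half)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, :half], x[:, half:]
    return np.concatenate([x1 * cos - x2 * sin,
                           x1 * sin + x2 * cos], axis=-1)
```

The relative-position property means the same weights generalize across positions, which is one reason RoPE-based encoders extend to longer contexts than learned absolute position embeddings.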
The text encoder outperforms the previous same-sized state of the art, XLM-R, while the embedding and reranking models match the performance of the larger state-of-the-art BGE-M3 models and achieve better results on long-context retrieval benchmarks.
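In a typical retrieval pipeline, the embedding model maps queries and documents to vectors, and candidates are ranked by cosine similarity before an optional reranking pass. A self-contained sketch of that first-stage ranking step (in practice the vectors would come from an mGTE embedding model; dummy vectors are used here to keep the example runnable):

```python
import numpy as np

def rank_by_cosine(query: np.ndarray, docs: np.ndarray) -> np.ndarray:
    """Return document indices ordered from most to least similar.

    query: (dim,) embedding; docs: (n_docs, dim) embedding matrix.
    Both are L2-normalized so the dot product equals cosine similarity.
    """
    q = query / np.linalg.norm(query)
    d = docs / np.linalg.norm(docs, axis=1, keepdims=True)
    scores = d @ q
    return np.argsort(-scores)

# Dummy embeddings standing in for model output.
query_vec = np.array([1.0, 0.0, 0.0])
doc_vecs = np.array([[0.0, 1.0, 0.0],
                     [1.0, 0.0, 0.0],
                     [0.5, 0.5, 0.0]])
order = rank_by_cosine(query_vec, doc_vecs)  # best match first
```

A cross-encoder reranker would then rescore only the top few candidates from this ordering, trading throughput for accuracy on the final ranking.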
Supports integration with Milvus, LangChain, and other popular vector databases and LLM frameworks.
Available as open-source models on Hugging Face or through Alibaba Cloud APIs with commercial pricing.