
    ModernBERT Embed

    An open-source embedding model from Nomic AI based on ModernBERT-base with 149M parameters. It supports 8192-token sequences and Matryoshka Representation Learning for a 3x memory reduction.


    About this tool

    Overview

    ModernBERT Embed is an embedding model trained from ModernBERT-base, bringing the advances of ModernBERT to embeddings. It was trained on the Nomic Embed weakly-supervised and supervised datasets.

    Key Features

    • Based on ModernBERT-base with 149M parameters
    • Outperforms both nomic-embed-text-v1 and nomic-embed-text-v1.5 on MTEB
    • Maximum sequence length of 8192 tokens
    • Supports Matryoshka Representation Learning with a truncated dimension of 256
    • Two valid output dimensionalities: 768 (full) and 256 (truncated)
    • 3x memory reduction with minimal performance loss at dimension 256 (see the sketch after this list)
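
    As a rough illustration of the Matryoshka truncation described above (a minimal sketch, not taken from the model card; the embedding values are placeholders), the first 256 of the 768 output dimensions are kept and re-normalized:

        import numpy as np

        # Stand-in for one 768-dimensional embedding produced by the model.
        full = np.random.rand(768).astype(np.float32)

        # Keep the first 256 dimensions (768 -> 256 floats, roughly 3x less memory) ...
        truncated = full[:256]
        # ... and re-normalize so cosine similarity still behaves as expected.
        truncated = truncated / np.linalg.norm(truncated)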

    Usage Requirements

    This model requires prefixes to be added to the input text, similar to Nomic Embed (a short usage sketch follows the list below):

    • Add search_query: prefix to queries
    • Add search_document: prefix to documents
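
    A minimal usage sketch with the sentence-transformers library, assuming the model is loaded from nomic-ai/modernbert-embed-base on Hugging Face and that the prefixes above are prepended verbatim (the example texts are illustrative):

        from sentence_transformers import SentenceTransformer, util

        model = SentenceTransformer("nomic-ai/modernbert-embed-base")

        # Queries get the search_query: prefix, documents get search_document:.
        query_embeddings = model.encode(["search_query: What is Matryoshka Representation Learning?"])
        doc_embeddings = model.encode([
            "search_document: Matryoshka Representation Learning trains embeddings that stay useful when truncated.",
            "search_document: ColBERT performs late interaction over token-level embeddings.",
        ])

        # Cosine similarity between the query and each document.
        print(util.cos_sim(query_embeddings, doc_embeddings))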

    Applications

    • Semantic search
    • Classification
    • Clustering
    • Reranking tasks

    License

    Apache 2.0 licensed; fully permissible for commercial use.

    Requirements

    Requires transformers>=4.48.0

    Availability

    Available on Hugging Face at nomic-ai/modernbert-embed-base


    Information

    Website: huggingface.co
    Published: Mar 8, 2026

    Categories

    Machine Learning Models

    Tags

    #Open Source #Embeddings #NLP

    Similar Products

    ColBERT
    Featured

    Late interaction architecture for efficient and effective passage search. Encodes queries and documents independently using BERT, then performs token-level similarity via the MaxSim operator for strong generalization.

    Qwen3 Embedding
    Featured

    Multilingual embedding model supporting over 100 languages and ranking #1 on MTEB multilingual leaderboard. Offers flexible model sizes from 0.6B to 8B parameters with user-defined instructions.

    all-MiniLM-L6-v2
    Featured

    A compact and efficient pre-trained sentence embedding model, widely used for generating vector representations of text. It's a popular choice for applications requiring fast and accurate semantic search, often integrated with vector databases.

    GTE Embeddings

    General Text Embeddings from Alibaba DAMO Academy, trained on large-scale relevance pairs. Available in three sizes (large, base, small), with GTE-v1.5 supporting an 8192-token context length.

    SentenceTransformer
    Featured

    A Python library for generating high-quality sentence, text, and image embeddings. It simplifies the process of converting text into dense vector representations, which are fundamental for similarity search and storage in vector databases.

    Haystack

    An open-source NLP framework for building end-to-end search systems, which can leverage vector search capabilities.
