



A lightweight, fast Python library for embedding generation, maintained by Qdrant. It runs on ONNX Runtime, requires no GPU, and uses Flag Embedding as its default model, with a claimed 12x inference speedup on CPUs and state-of-the-art accuracy.
FastEmbed is a lightweight, fast library for embedding generation built and maintained by Qdrant. It uses ONNX Runtime instead of PyTorch, making it ideal for CPU-only environments and serverless deployments.
Available in:
pip install fastembed
Native integration with Qdrant vector database:
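from fastembed import TextEmbedding
from qdrant_client import QdrantClient
# Load the default embedding model (runs on CPU via ONNX Runtime)
embedding = TextEmbedding()
# In-memory Qdrant instance, handy for local experiments
client = QdrantClient(":memory:")
# embed() returns a generator, so materialize it with list()
vectors = list(embedding.embed(["Hello world"]))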
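The snippet above creates a client but stops before storing anything. A minimal sketch of the remaining steps might look like the following. The collection name "demo", the sample texts, and the helpers `to_point_dicts` and `demo` are hypothetical and chosen only for illustration; `query_points` is assumed to be available, which requires a reasonably recent qdrant-client release.

```python
from typing import Iterable, List, Sequence


def to_point_dicts(texts: Sequence[str], vectors: Iterable[Iterable[float]]) -> List[dict]:
    """Pair each text with its vector in the id/vector/payload shape Qdrant expects.

    Hypothetical helper: ids are simply the positions of the texts.
    """
    return [
        {"id": i, "vector": [float(x) for x in vec], "payload": {"text": text}}
        for i, (text, vec) in enumerate(zip(texts, vectors))
    ]


def demo() -> None:
    """Embed a few texts, store them in an in-memory collection, and query it."""
    # Imported here so the helper above stays usable without these packages.
    from fastembed import TextEmbedding
    from qdrant_client import QdrantClient
    from qdrant_client.models import Distance, PointStruct, VectorParams

    texts = ["Hello world", "FastEmbed runs on ONNX Runtime"]  # sample data
    model = TextEmbedding()  # downloads the default model on first use
    vectors = list(model.embed(texts))

    client = QdrantClient(":memory:")
    client.create_collection(
        collection_name="demo",  # assumed name
        vectors_config=VectorParams(size=len(vectors[0]), distance=Distance.COSINE),
    )
    client.upsert(
        collection_name="demo",
        points=[PointStruct(**p) for p in to_point_dicts(texts, vectors)],
    )

    # Embed the query with the same model and fetch the closest stored text.
    query_vec = next(iter(model.embed(["greeting"])))
    hits = client.query_points(
        collection_name="demo",
        query=[float(x) for x in query_vec],
        limit=1,
    ).points
    for hit in hits:
        print(hit.score, hit.payload["text"])
```

Calling `demo()` runs the full round trip. The vector size is read from the first embedding rather than hard-coded, so the sketch does not depend on which model is the default.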
# Install
pip install fastembed
# Basic usage
from fastembed import TextEmbedding
model = TextEmbedding()
embeddings = list(model.embed(["Hello world"]))
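To show what the resulting embeddings can be used for, here is a small sketch that ranks documents by cosine similarity to a query. The helpers `cosine_similarity`, `rank_by_similarity`, and `demo`, along with the sample sentences, are hypothetical names introduced for this example and are not part of FastEmbed.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_similarity(
    query_vec: Sequence[float], doc_vecs: Sequence[Sequence[float]]
) -> List[Tuple[int, float]]:
    """Return (document index, score) pairs, most similar first."""
    scores = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(doc_vecs)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


def demo() -> None:
    """Embed sample documents with the default model and rank them for a query."""
    # Imported here so the helpers above work without fastembed installed.
    from fastembed import TextEmbedding

    docs = [
        "FastEmbed generates embeddings on the CPU",
        "Bananas are rich in potassium",
    ]
    model = TextEmbedding()  # downloads the default model on first use
    doc_vecs = list(model.embed(docs))
    query_vec = next(iter(model.embed(["How do I create text embeddings?"])))

    for idx, score in rank_by_similarity(query_vec, doc_vecs):
        print(f"{score:.3f}  {docs[idx]}")
```

Calling `demo()` prints each document with its score. The similarity helpers are written in plain Python for clarity; for large collections a vectorized or database-backed search would be the more practical choice.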
Completely free and open-source. No API costs, no usage limits.