This is a demo directory website built with Ever Works
RankGPT
LLM-based listwise reranking approach that prompts instruction-following models such as ChatGPT and GPT-4 to generate a relevance-ordered permutation of candidate documents, with the orderings optionally distilled into smaller specialized rankers. Uses the generative capabilities of large language models to improve retrieval ranking in search and RAG systems.
Cross-Encoder Reranking
Two-stage retrieval where initial results from bi-encoder vector search are reranked using more expensive cross-encoder models for higher accuracy. Used in Hindsight and other systems.
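The two-stage pattern can be sketched in plain Python. The scoring functions below are toy stand-ins for real bi-encoder and cross-encoder models (the function names and corpus are illustrative, not from any particular system):

```python
# Two-stage retrieval sketch. A cheap first-stage scorer narrows the corpus,
# then a more expensive pairwise scorer reranks only the survivors.
# Both scorers are toy stand-ins for real bi-encoder / cross-encoder models.

def first_stage_score(query: str, doc: str) -> float:
    # Stand-in for comparing independently encoded vectors:
    # simple word-overlap similarity.
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / max(len(q), 1)

def cross_encoder_score(query: str, doc: str) -> float:
    # Stand-in for a model that reads the full (query, document) pair:
    # overlap weighted by how early each query term appears in the doc.
    words = doc.lower().split()
    return sum(1.0 / (1 + words.index(t))
               for t in query.lower().split() if t in words)

def retrieve_then_rerank(query, corpus, k=3, top_n=2):
    # Stage 1: score the whole corpus cheaply and keep the top-k candidates.
    candidates = sorted(corpus,
                        key=lambda d: first_stage_score(query, d),
                        reverse=True)[:k]
    # Stage 2: rerank only those candidates with the expensive scorer.
    return sorted(candidates,
                  key=lambda d: cross_encoder_score(query, d),
                  reverse=True)[:top_n]

corpus = [
    "reranking improves retrieval quality",
    "a cross encoder reads query and document together",
    "bananas are yellow",
    "vector search retrieves candidates quickly",
]
print(retrieve_then_rerank("cross encoder reranking", corpus))
```

In production the expensive stage-2 model only ever sees k documents, which is what makes cross-encoder accuracy affordable at corpus scale.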
Re2G
Retrieve, Rerank, Generate system from IBM Research that combines neural retrieval and reranking with BART-based generation, achieving 9-34% gains over previous SOTA on the KILT leaderboard.
BGE-reranker-v2-m3
Open-source multilingual reranking model from BAAI supporting 100+ languages under Apache 2.0 licensing; matches Cohere-class latency on GPU with no per-query API fees when self-hosted for production deployments.
Cross-Encoder
Neural reranking architecture that examines full query-document pairs simultaneously for deeper semantic understanding, achieving higher accuracy than bi-encoders at the cost of computational efficiency.
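A toy contrast shows why seeing the pair jointly helps: independent bag-of-words encodings cannot tell whether a shared term is negated, while a pair-aware scorer can. Both scorers below are contrived illustrations, not real models:

```python
# Toy illustration of bi-encoder-style vs cross-encoder-style scoring.
# Both functions are contrived stand-ins, not real models.

NEGATORS = {"no", "not", "without"}

def bi_score(query: str, doc: str) -> int:
    # Independent encodings reduce (roughly) to comparing bags of words,
    # so negation around a shared term is invisible.
    return len(set(query.lower().split()) & set(doc.lower().split()))

def negated_terms(tokens):
    # Terms immediately following a negator word.
    return {tokens[i + 1] for i, w in enumerate(tokens[:-1]) if w in NEGATORS}

def cross_score(query: str, doc: str) -> int:
    # Reading the pair jointly lets the scorer model interactions,
    # e.g. whether both sides negate the same term.
    q, d = query.lower().split(), doc.lower().split()
    base = len(set(q) & set(d))
    qn, dn = negated_terms(q), negated_terms(d)
    return base + 2 * len(qn & dn) - 2 * len(qn ^ dn)

query = "laptop without touchscreen"
doc_a = "laptop with touchscreen"
doc_b = "laptop no touchscreen"
# The bag-of-words score cannot separate the two documents...
print(bi_score(query, doc_a), bi_score(query, doc_b))
# ...but the pair-aware score prefers the one agreeing on the negation.
print(cross_score(query, doc_a), cross_score(query, doc_b))
```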
FlashRank
Ultra-lightweight, fast Python reranking library built on state-of-the-art cross-encoders and LLMs. Runs entirely on CPU with no PyTorch dependency; its smallest reranking model weighs in at ~4 MB.
BGE Reranker Base
Open-source cross-encoder reranking model from BAAI that enhances RAG retrieval quality by examining query-document pairs individually. Self-hostable with Apache 2.0 licensing for cost-effective production deployments.
Cohere Rerank
Proprietary neural network reranker, accessed via API, that processes the query and document together as a cross-encoder to judge relevance precisely. Supports over 100 languages, with the Rerank 3 Nimble variant offering lower latency for production workloads.
Reranking Models
Cross-encoder models that rerank initial retrieval results for improved relevance. More accurate than bi-encoders but slower, typically applied to top-k candidates.
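The "top-k only" pattern is simple to express. A minimal sketch (function name and scores are illustrative) that re-scores just the head of a first-stage ranking and leaves the tail untouched:

```python
def rerank_top_k(query, ranked_docs, score_fn, k=10):
    # Re-score only the top-k first-stage candidates with an expensive
    # pairwise score_fn; the tail keeps its original first-stage order.
    head, tail = ranked_docs[:k], ranked_docs[k:]
    head = sorted(head, key=lambda d: score_fn(query, d), reverse=True)
    return head + tail

# Demo with made-up cross-encoder scores for five first-stage results.
scores = {"d1": 0.2, "d2": 0.9, "d3": 0.5, "d4": 0.1, "d5": 0.4}
reranked = rerank_top_k("q", ["d1", "d2", "d3", "d4", "d5"],
                        lambda q, d: scores[d], k=3)
print(reranked)  # only the first three positions are re-ordered
```

Keeping k small bounds the number of slow cross-encoder passes regardless of how many candidates the first stage returns.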