



A retrieval paradigm in which query and document tokens are encoded separately and their interactions are computed at search time, combining the efficiency of bi-encoders with the expressiveness of cross-encoders.
Late Interaction is a retrieval paradigm where query and document are encoded independently into multiple vectors (one per token), and their interaction is computed efficiently at search time. This approach bridges the gap between fast bi-encoders and accurate cross-encoders.
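To make the search-time interaction concrete, here is a minimal sketch of the MaxSim scoring used by ColBERT-style models: each query token vector is matched with its most similar document token vector, and those maxima are summed. The helper names (`l2_normalize`, `maxsim_score`) and the random embeddings are illustrative assumptions, not part of any library; a real system would take the vectors from a trained encoder.

```python
import numpy as np

def l2_normalize(x):
    # Scale each token vector to unit length so dot products are cosine similarities.
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

def maxsim_score(query_vecs, doc_vecs):
    # query_vecs: (num_query_tokens, dim), doc_vecs: (num_doc_tokens, dim)
    q = l2_normalize(query_vecs)
    d = l2_normalize(doc_vecs)
    sim = q @ d.T  # pairwise similarities, shape (num_query_tokens, num_doc_tokens)
    # For each query token keep its best-matching document token, then sum.
    return float(sim.max(axis=1).sum())

# Hypothetical stand-ins for encoder output: 4 query tokens, 3 documents of varying length.
rng = np.random.default_rng(0)
query = rng.normal(size=(4, 8))
docs = [rng.normal(size=(n, 8)) for n in (6, 10, 3)]

scores = [maxsim_score(query, d) for d in docs]
ranking = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
print(ranking, scores)
```

Because documents are encoded without seeing the query, `doc_vecs` can be computed and indexed ahead of time; only the small similarity matrix and the max/sum reduction run per query.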
ColBERT (Contextualized Late Interaction over BERT) is the best-known late interaction model:
Stores multiple vectors per document:
Implemented in open-source libraries (ColBERT, RAGatouille, etc.)
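Storing one vector per token has a direct cost in index size. The rough arithmetic below compares a single-vector index with a per-token index; the token counts, the dimension of 128, and the uncompressed 4-byte floats are all hypothetical values chosen for illustration, and the same dimension is assumed for both indexes to keep the comparison simple.

```python
def index_size_bytes(num_vectors, dim, bytes_per_value=4):
    # Uncompressed size of a dense index: one float per dimension per vector.
    return num_vectors * dim * bytes_per_value

doc_token_counts = [120, 80, 200]  # hypothetical token counts for three documents
dim = 128                          # hypothetical embedding dimension

single_vector = index_size_bytes(len(doc_token_counts), dim)  # one vector per document
multi_vector = index_size_bytes(sum(doc_token_counts), dim)   # one vector per token

print(single_vector, multi_vector, multi_vector / single_vector)
```

The ratio is simply the average number of tokens per document, which is why late interaction systems generally pair per-token storage with some form of vector compression.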