



A retrieval paradigm where query and document encodings are kept separate until a late interaction stage, enabling more expressive and efficient similarity computations. Pioneered by ColBERT and extended by ColPali and ColQwen, this approach maintains fine-grained representations while enabling fast retrieval.
Late Interaction Retrieval is a retrieval paradigm where query and document encodings are computed independently and kept separate until a "late interaction" stage. This contrasts with traditional dense retrieval where queries and documents are encoded into single vectors.
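Instead of encoding text into a single vector, late interaction models keep one embedding per token (or per image patch in visual models such as ColPali), so each query and each document is represented by a set of vectors.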
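The key operation in late interaction is MaxSim: for each query token embedding, take the maximum similarity over all document token embeddings, then sum these maxima across the query tokens to get the document's relevance score.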
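As a rough illustration of MaxSim scoring, the sketch below uses plain Python lists as token embeddings. The function names (`dot`, `maxsim`, `rank`) and the toy vectors are made up for this example and are not taken from ColBERT or any other library. It assumes every embedding is already unit-normalised, so the dot product stands in for cosine similarity.

```python
from typing import List, Sequence

Vector = Sequence[float]


def dot(u: Vector, v: Vector) -> float:
    """Dot product; equals cosine similarity if both vectors are unit-normalised."""
    return sum(a * b for a, b in zip(u, v))


def maxsim(query_tokens: Sequence[Vector], doc_tokens: Sequence[Vector]) -> float:
    """Sum, over query tokens, of the best similarity to any document token.

    Assumes doc_tokens is non-empty (max() raises ValueError otherwise).
    """
    return sum(max(dot(q, d) for d in doc_tokens) for q in query_tokens)


def rank(query_tokens: Sequence[Vector], docs: Sequence[Sequence[Vector]]) -> List[int]:
    """Return document indices sorted by descending MaxSim score."""
    scores = [maxsim(query_tokens, d) for d in docs]
    return sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)


# Toy 2-dimensional token embeddings (hypothetical values, illustration only).
query = [[1.0, 0.0], [0.0, 1.0]]
doc_a = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]  # matches both query tokens
doc_b = [[1.0, 0.0], [0.9, 0.1]]              # matches only the first one

print(maxsim(query, doc_a))
print(maxsim(query, doc_b))
print(rank(query, [doc_b, doc_a]))  # doc_a (index 1) ranks first
```

Document embeddings do not depend on the query, so they can be computed and indexed ahead of time. Only this lightweight scoring step runs at query time. Production systems typically batch the computation as matrix operations rather than Python loops.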
Several major vector databases support late interaction through multi-vector storage and scoring.
Implementations of the concept vary, and each model and database has its own pricing.