



A research paper introducing lightweight, learning-free binary embeddings for fast retrieval. The approach uses isolation kernels to generate binary embeddings that dramatically reduce storage requirements (32× compression) while maintaining retrieval quality.
Published in January 2026 (arXiv:2601.09159), the paper proposes generating binary embeddings with isolation kernels: a lightweight, learning-free alternative to trained hashing models that compresses vectors aggressively while preserving retrieval quality.
Unlike neural approaches that require training, the method produces binary codes directly from the data, with no learned parameters and no optimization loop.
Isolation kernels measure similarity by how easily points can be separated: the space is split by random partitions, and two points are deemed similar in proportion to how often they land in the same partition cell.
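As an illustration, here is a sketch of that idea in the style of Ting et al.'s isolation kernel, not necessarily the paper's exact algorithm: similarity is estimated as the fraction of random Voronoi partitions in which two points share a cell. The parameters `t` (number of partitions) and `psi` (points sampled per partition) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def ik_similarity(x, y, data, t=100, psi=16):
    """Estimate isolation-kernel similarity: the fraction of t random
    Voronoi partitions (each induced by psi points sampled from `data`)
    in which x and y fall into the same cell."""
    same = 0
    for _ in range(t):
        centers = data[rng.choice(len(data), size=psi, replace=False)]
        cx = np.linalg.norm(centers - x, axis=1).argmin()  # x's cell
        cy = np.linalg.norm(centers - y, axis=1).argmin()  # y's cell
        same += (cx == cy)
    return same / t

data = rng.normal(size=(1000, 8))
near = data[0] + 0.01   # almost identical to data[0]
far = data[0] + 5.0     # far away in every dimension
print(ik_similarity(data[0], near, data) > ik_similarity(data[0], far, data))  # True
```

Note that the kernel adapts to data density: cells are smaller where sampled points are dense, so the same geometric distance counts for less in crowded regions.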
Traditional binary embeddings typically require a trained hash function and an optimization pass over the data. The isolation kernel approach instead derives each code directly: a point's binary embedding records which cell it falls into in each random partition.
Compression: 32× reduction in storage
Speed: 40× faster similarity computation (combining storage and compute benefits)
Quality: Maintains competitive retrieval accuracy despite extreme compression
Scalability: Particularly effective for billion-scale datasets
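The 32× figure is what one-bit-per-dimension codes give over float32 vectors, and Hamming-style comparison reduces to XOR plus popcount. A quick check of that arithmetic (the dimensionality is chosen for illustration):

```python
import numpy as np

rng = np.random.default_rng(2)
d = 1024                                    # illustrative dimensionality
bits_a = rng.integers(0, 2, d, dtype=np.uint8)
bits_b = rng.integers(0, 2, d, dtype=np.uint8)
a, b = np.packbits(bits_a), np.packbits(bits_b)

print((d * 4) / a.nbytes)                   # 32.0: float32 vector vs 1-bit code
hamming = np.unpackbits(a ^ b).sum()        # XOR + popcount distance
print(hamming == (bits_a != bits_b).sum())  # True: matches bitwise comparison
```

At billion scale the same ratio applies to the whole index: a corpus of 10^9 float32 vectors at d = 1024 occupies about 4 TB, while the packed codes fit in roughly 128 GB.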
Binary embeddings involve a quality vs. efficiency trade-off:
vs. Product Quantization: More aggressive compression, simpler implementation
vs. Neural Binary Embeddings: No training required, faster to deploy
vs. Full-Precision: Much faster and smaller, at the cost of some retrieval accuracy
Demonstrates that advanced mathematical techniques (isolation kernels) can achieve compression competitive with learned methods, opening new avenues for efficient vector search.
The preprint includes the full algorithmic details and experimental validation.