



Training technique in CSRv2 that stabilizes sparsity learning by gradually increasing sparsity constraints, reducing dead neurons from >80% to ~20%.
Progressive K-Annealing is a training technique used in CSRv2 that stabilizes sparsity learning by gradually increasing the sparsity constraint (reducing k) during training.
Instead of starting with ultra-sparse representations (k=2 or k=4), training begins with higher k values and progressively reduces k over the course of training. This lets the model first learn good representations under a loose sparsity budget before they are compressed to the final target sparsity.
The annealing schedule typically reduces k in stages as training progresses, tightening the sparsity constraint gradually rather than all at once.
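A minimal sketch of the idea, assuming a linear annealing schedule and a simple top-k sparsification step (the schedule shape, the start/end values `k_start=32` and `k_final=4`, and both function names are illustrative assumptions, not the CSRv2 implementation):

```python
import numpy as np

def k_schedule(step, total_steps, k_start=32, k_final=4):
    # Hypothetical linear schedule: interpolate k from a loose initial
    # budget (k_start) down to the final target (k_final), then hold.
    frac = min(step / total_steps, 1.0)
    return max(round(k_start + frac * (k_final - k_start)), k_final)

def top_k_sparsify(z, k):
    # Keep only the k largest-magnitude activations; zero the rest.
    idx = np.argsort(np.abs(z))[-k:]
    out = np.zeros_like(z)
    out[idx] = z[idx]
    return out

# During training, the current k comes from the schedule:
# k = k_schedule(step, total_steps); z_sparse = top_k_sparsify(z, k)
```

Because early steps keep many activations alive, far fewer units are starved of gradient signal than when training starts directly at the final ultra-sparse k, which is the mechanism behind the reported drop in dead neurons.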
CSRv2 with progressive k-annealing achieves up to 300x improvements in compute and memory efficiency relative to dense embeddings and 7x speedup over Matryoshka Representation Learning.
Research technique, open-source implementation.