Inlay

Rescaling MLM-Head for Neural Sparse Retrieval

Finds that pretrained encoders with large MLM-head scales degrade in sparse retrieval, and introduces a zero-cost rescaling correction to stabilize training. 📝 arxiv.org/abs/2606.18811

Learned sparse retrieval (LSR) models such as SPLADE have traditionally used BERT-style masked language models as backbone encoders. A natural expectation is that replacing BERT with stronger pretrain...