Effective Inference-Free Retrieval for Learned Sparse Representations

📅 2025-04-30
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper addresses the query encoding bottleneck in learned sparse retrieval (LSR), where real-time neural inference limits efficiency. The authors propose a model-free query processing paradigm that reduces query encoding to token-level score lookups, eliminating on-the-fly encoding overhead entirely. Methodologically, they relax conventional regularization constraints and instead co-optimize token-level score modeling with inverted index construction, preserving computational efficiency while improving generalization. Experiments demonstrate state-of-the-art performance: +1.0 MRR@10 over Splade-v3-Doc on MS MARCO and +1.8 nDCG@10 on the BEIR cross-domain benchmark. The core contribution is the first realization of "zero-inference" query encoding in sparse retrieval, combining efficiency, effectiveness, and strong cross-domain generalization.

📝 Abstract
Learned Sparse Retrieval (LSR) is an effective IR approach that exploits pre-trained language models for encoding text into a learned bag of words. Several efforts in the literature have shown that sparsity is key to enabling a good trade-off between the efficiency and effectiveness of the query processor. To induce the right degree of sparsity, researchers typically use regularization techniques when training LSR models. Recently, new efficient -- inverted index-based -- retrieval engines have been proposed, leading to a natural question: has the role of regularization changed in training LSR models? In this paper, we conduct an extended evaluation of regularization approaches for LSR where we discuss their effectiveness, efficiency, and out-of-domain generalization capabilities. We first show that regularization can be relaxed to produce more effective LSR encoders. We also show that query encoding is now the bottleneck limiting the overall query processor performance. To remove this bottleneck, we advance the state-of-the-art of inference-free LSR by proposing Learned Inference-free Retrieval (Li-LSR). At training time, Li-LSR learns a score for each token, casting the query encoding step into a seamless table lookup. Our approach yields state-of-the-art effectiveness for both in-domain and out-of-domain evaluation, surpassing Splade-v3-Doc by 1 point of MRR@10 on MS MARCO and 1.8 points of nDCG@10 on BEIR.
Problem

Research questions and friction points this paper is trying to address.

Evaluating regularization approaches for Learned Sparse Retrieval (LSR) models
Addressing query encoding as bottleneck in LSR performance
Proposing inference-free LSR (Li-LSR) for improved retrieval effectiveness
Innovation

Methods, ideas, or system contributions that make the work stand out.

Relaxed regularization enhances LSR encoder effectiveness
Query encoding identified as performance bottleneck
Li-LSR enables inference-free retrieval via token scores
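The table-lookup idea behind Li-LSR can be sketched in a few lines: instead of running an encoder at query time, each query token's weight comes from a score table learned at training time, and documents are ranked with a dot product over an inverted index. This is a minimal illustrative sketch, not the paper's implementation; the `token_scores` values and the toy index below are hypothetical.

```python
# Minimal sketch of inference-free query scoring (Li-LSR-style):
# query "encoding" is a table lookup, ranking is an inverted-index dot product.
from collections import defaultdict

# Hypothetical per-token scores, assumed to be learned offline at training time.
token_scores = {"sparse": 2.1, "retrieval": 1.7, "neural": 1.4, "the": 0.0}

# Toy inverted index: token -> postings of (doc_id, document-side weight).
inverted_index = {
    "sparse": [(0, 0.9), (2, 1.3)],
    "retrieval": [(0, 1.1), (1, 0.6)],
    "neural": [(1, 0.8)],
}

def score_query(query_tokens):
    """Rank documents for a tokenized query without any model inference."""
    scores = defaultdict(float)
    for tok in query_tokens:
        q_w = token_scores.get(tok, 0.0)  # table lookup replaces neural encoding
        if q_w == 0.0:
            continue  # zero-weight tokens (e.g. stopwords) traverse no postings
        for doc_id, d_w in inverted_index.get(tok, []):
            scores[doc_id] += q_w * d_w
    return sorted(scores.items(), key=lambda kv: -kv[1])

print(score_query(["sparse", "retrieval", "the"]))
```

Because query weights are fixed scalars, the whole query-processing cost collapses into postings traversal, which is why the encoding bottleneck disappears.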