A Systematic Study of Pseudo-Relevance Feedback with LLMs

📅 2026-03-11
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study addresses the common conflation of feedback sources and feedback models in existing pseudo-relevance feedback (PRF) research by conducting controlled experiments that systematically disentangle their individual contributions. Five large language model (LLM)-driven PRF methods are evaluated across 13 low-resource BEIR tasks, separately quantifying the impact of each design dimension on retrieval effectiveness. The findings show that the choice of feedback model critically influences PRF performance, that feedback derived solely from LLM-generated text is the most cost-effective option, and that corpus-derived feedback is most beneficial when the first-stage retriever is strong. This work provides clear empirical evidence and actionable guidance for designing effective PRF mechanisms in the era of large language models.

📝 Abstract
Pseudo-relevance feedback (PRF) methods built on large language models (LLMs) can be organized along two key design dimensions: the feedback source, which is where the feedback text is derived from, and the feedback model, which is how the given feedback text is used to refine the query representation. However, the independent role that each dimension plays is unclear, as both are often entangled in empirical evaluations. In this paper, we address this gap by systematically studying how the choice of feedback source and feedback model impacts PRF effectiveness through controlled experimentation. Across 13 low-resource BEIR tasks with five LLM PRF methods, our results show: (1) the choice of feedback model can play a critical role in PRF effectiveness; (2) feedback derived solely from LLM-generated text provides the most cost-effective solution; and (3) feedback derived from the corpus is most beneficial when utilizing candidate documents from a strong first-stage retriever. Together, our findings provide a better understanding of which elements in the PRF design space are most important.
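To make the two design dimensions concrete, here is a minimal sketch of one classic feedback model, Rocchio-style vector interpolation, applied in embedding space. All names and weights below are illustrative assumptions, not the paper's actual implementation; the paper evaluates five distinct LLM PRF methods. The sketch only highlights where the feedback source plugs in: `feedback_vecs` could be embeddings of top-ranked corpus documents, or embeddings of text generated by an LLM for the query.

```python
# Hedged sketch: Rocchio-style pseudo-relevance feedback in embedding space.
# The function names and alpha/beta weights are illustrative assumptions,
# not the specific methods studied in the paper.

def refine_query(query_vec, feedback_vecs, alpha=0.7, beta=0.3):
    """Interpolate the query vector with the centroid of the feedback vectors.

    feedback_vecs can come from either feedback source the paper contrasts:
    candidate documents from a first-stage retriever, or LLM-generated text.
    """
    dim = len(query_vec)
    # Centroid of the feedback vectors (the "feedback model" step).
    centroid = [sum(v[i] for v in feedback_vecs) / len(feedback_vecs)
                for i in range(dim)]
    # Weighted combination of the original query and the feedback signal.
    return [alpha * q + beta * c for q, c in zip(query_vec, centroid)]

# Toy usage with 3-dimensional vectors.
q = [1.0, 0.0, 0.0]
fb = [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
refined = refine_query(q, fb)
```

The design choice the paper probes is exactly the quality of `feedback_vecs`: a weak first-stage retriever supplies noisy centroids, whereas LLM-generated text sidesteps retrieval entirely at some generation cost.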
Problem

Research questions and friction points this paper is trying to address.

pseudo-relevance feedback
large language models
feedback source
feedback model
information retrieval