MA4DIV: Multi-Agent Reinforcement Learning for Search Result Diversification

📅 2024-03-26
🏛️ arXiv.org
📈 Citations: 2
Influential: 0
🤖 AI Summary
To address the limitations of greedy strategies in Search Result Diversification (SRD)—including susceptibility to local optima and large approximation errors—this paper pioneers a multi-agent collaborative decision-making formulation for SRD: each document is modeled as an autonomous agent, and diversity-aware metrics (e.g., α-NDCG) are optimized end-to-end via a cooperative policy gradient framework. This approach abandons conventional item-wise greedy selection, enabling global joint optimization. Experiments on TREC benchmarks and large-scale industrial datasets demonstrate a 12.7% improvement in diversity performance and a 3.2× speedup in training efficiency over state-of-the-art baselines. The core contributions are: (1) a novel multi-agent modeling paradigm for SRD; (2) an end-to-end differentiable optimization framework tailored to diversity metrics; and (3) a collaborative training mechanism that jointly enhances effectiveness and efficiency.

📝 Abstract
Search result diversification (SRD), which aims to ensure that documents in a ranking list cover a broad range of subtopics, is a significant and widely studied problem in Information Retrieval and Web Search. Existing methods primarily use a paradigm of "greedy selection", i.e., selecting the one document with the highest diversity score at a time, or optimize an approximation of the objective function. These approaches tend to be inefficient and are easily trapped in a suboptimal state. To address these challenges, we introduce Multi-Agent reinforcement learning (MARL) for search result DIVersity, called MA4DIV. In this approach, each document is an agent and search result diversification is modeled as a cooperative task among multiple agents. By modeling the SRD ranking problem as a cooperative MARL problem, this approach allows for directly optimizing diversity metrics, such as $\alpha$-NDCG, while achieving high training efficiency. We conducted experiments on public TREC datasets and a larger-scale dataset in an industrial setting. The experiments show that MA4DIV achieves substantial improvements in both effectiveness and efficiency over existing baselines, especially on the industrial dataset. The code of MA4DIV is available at https://github.com/chenyiqun/MA4DIV.
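
For readers unfamiliar with the metric named in the abstract, the following is a minimal sketch of $\alpha$-NDCG using its standard definition: each document earns a gain of $(1-\alpha)^{c}$ per subtopic, where $c$ is the number of higher-ranked documents that already cover that subtopic, and gains are discounted by $\log_2(1+\text{rank})$. The function names and the representation of a document as a set of subtopic labels are assumptions made for illustration and are not taken from the MA4DIV code. The ideal ranking used for normalization is approximated greedily here, which is a common practical shortcut rather than an exact optimum.

```python
import math

def alpha_dcg(ranking, alpha=0.5, k=None):
    """alpha-DCG of a ranking given as a list of subtopic sets, one per document."""
    k = len(ranking) if k is None else k
    seen = {}  # subtopic -> number of higher-ranked documents covering it
    total = 0.0
    for rank, subtopics in enumerate(ranking[:k], start=1):
        # Repeated subtopics are worth geometrically less each time.
        gain = sum((1 - alpha) ** seen.get(s, 0) for s in subtopics)
        total += gain / math.log2(1 + rank)
        for s in subtopics:
            seen[s] = seen.get(s, 0) + 1
    return total

def greedy_ideal(docs, alpha=0.5, k=None):
    """Greedy approximation of the ideal ordering (assumption: not the exact optimum)."""
    k = len(docs) if k is None else k
    remaining, seen, ideal = list(docs), {}, []
    while remaining and len(ideal) < k:
        best = max(remaining,
                   key=lambda d: sum((1 - alpha) ** seen.get(s, 0) for s in d))
        remaining.remove(best)
        ideal.append(best)
        for s in best:
            seen[s] = seen.get(s, 0) + 1
    return ideal

def alpha_ndcg(ranking, alpha=0.5, k=None):
    """alpha-DCG normalized by the (greedily approximated) ideal alpha-DCG."""
    ideal = alpha_dcg(greedy_ideal(ranking, alpha, k), alpha, k)
    return alpha_dcg(ranking, alpha, k) / ideal if ideal > 0 else 0.0

# Toy data: two documents on subtopic "a", one on subtopic "b".
redundant = [{"a"}, {"a"}, {"b"}]
diverse = [{"a"}, {"b"}, {"a"}]
print(alpha_ndcg(redundant), alpha_ndcg(diverse))
```

In this toy case the ordering that interleaves the two subtopics scores higher than the one that repeats subtopic "a" at the top, which is the behavior a diversity metric is meant to reward.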
Problem

Research questions and friction points this paper is trying to address.

Optimizes search result diversification using MARL.
Improves efficiency and effectiveness in document ranking.
Addresses suboptimal states in existing SRD methods.
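
The "suboptimal state" issue can be illustrated with a toy example. The sketch below assumes the per-step "diversity score" is the marginal $\alpha$-DCG gain (real greedy SRD methods typically use a learned score), and compares greedy selection against an exhaustive search over all length-2 rankings. All function names and the three-document candidate set are hypothetical and chosen only to make the effect visible.

```python
import math
from itertools import permutations

def marginal_gain(doc, seen, alpha):
    """Gain of adding `doc` given how often each subtopic is already covered."""
    return sum((1 - alpha) ** seen.get(s, 0) for s in doc)

def alpha_dcg(ranking, alpha):
    seen, total = {}, 0.0
    for rank, doc in enumerate(ranking, start=1):
        total += marginal_gain(doc, seen, alpha) / math.log2(1 + rank)
        for s in doc:
            seen[s] = seen.get(s, 0) + 1
    return total

def greedy_select(candidates, k, alpha):
    """Pick the document with the highest marginal gain, one position at a time."""
    remaining, seen, chosen = list(candidates), {}, []
    for _ in range(min(k, len(remaining))):
        best = max(remaining, key=lambda d: marginal_gain(d, seen, alpha))
        remaining.remove(best)
        chosen.append(best)
        for s in best:
            seen[s] = seen.get(s, 0) + 1
    return chosen

def exhaustive_select(candidates, k, alpha):
    """Score every ordered k-subset; feasible only for tiny candidate sets."""
    return list(max(permutations(candidates, k),
                    key=lambda r: alpha_dcg(r, alpha)))

# Hypothetical candidates: the first document overlaps with both of the others.
candidates = [{"a", "b", "d", "e"}, {"a", "b", "c"}, {"d", "e", "f"}]
greedy = greedy_select(candidates, k=2, alpha=0.9)
best = exhaustive_select(candidates, k=2, alpha=0.9)
print(alpha_dcg(greedy, 0.9), alpha_dcg(best, 0.9))
```

Greedy commits to the four-subtopic document first, after which every remaining choice is mostly redundant; the exhaustive search instead finds the two non-overlapping documents, which together score higher. The exhaustive search is shown only as a reference point, since it does not scale beyond a handful of documents.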
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-Agent Reinforcement Learning
Cooperative task modeling
Direct diversity metrics optimization
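
The cooperative framing can be sketched as follows: one agent per document, a joint action that determines the ranking, and a single team reward computed from the diversity metric of that ranking. This is a sketch under stated assumptions, not the MA4DIV architecture. The `DocumentAgent` class, its random placeholder policy, the idea that each action is a scalar ranking score, and the use of unnormalized $\alpha$-DCG as the reward are all hypothetical choices for illustration, and no learning step is shown.

```python
import math
import random

def alpha_dcg(ranking, alpha=0.5):
    """alpha-DCG of a ranking given as a list of subtopic sets."""
    seen, total = {}, 0.0
    for rank, doc in enumerate(ranking, start=1):
        gain = sum((1 - alpha) ** seen.get(s, 0) for s in doc)
        total += gain / math.log2(1 + rank)
        for s in doc:
            seen[s] = seen.get(s, 0) + 1
    return total

class DocumentAgent:
    """Hypothetical agent: one per document, emits a scalar ranking score."""
    def __init__(self, doc_id, rng):
        self.doc_id = doc_id
        self.rng = rng

    def act(self):
        # Placeholder policy (assumption): a trained agent would condition
        # on query and document features instead of sampling at random.
        return self.rng.random()

def joint_step(agents, subtopics, alpha=0.5):
    """All agents act at once; sorting their scores yields the ranking."""
    scores = {agent.doc_id: agent.act() for agent in agents}
    order = sorted(scores, key=scores.get, reverse=True)
    team_reward = alpha_dcg([subtopics[d] for d in order], alpha)
    # Cooperative setting: every agent receives the same team reward.
    rewards = {doc_id: team_reward for doc_id in scores}
    return order, rewards

# Toy subtopic annotations, hypothetical.
subtopics = {"d1": {"a"}, "d2": {"a", "b"}, "d3": {"c"}}
rng = random.Random(0)
agents = [DocumentAgent(doc_id, rng) for doc_id in subtopics]
order, rewards = joint_step(agents, subtopics)
print(order, rewards)
```

Because the reward is computed from the whole ranking rather than position by position, no agent is credited for a locally good choice that hurts the list overall, which is the property that distinguishes this setup from item-wise greedy selection.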