A Component-Based Survey of Interactions between Large Language Models and Multi-Armed Bandits

📅 2026-01-19
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the decision-making challenges faced by large language models (LLMs) in training, retrieval-augmented generation, and personalization, as well as the representational limitations of multi-armed bandits (MABs) in action definition and environment modeling. It presents the first systematic, component-level analysis of the bidirectional interaction mechanisms between LLMs and MABs. By establishing a component-wise analytical framework and conducting a comprehensive literature review, the study introduces a mutual enhancement paradigm ("LLM-enhanced MAB" and "MAB-enhanced LLM"), characterizing representative architectures, performance trends, and key challenges. To foster further research, the authors also release an open-access literature index repository.

📝 Abstract
Large language models (LLMs) have become powerful and widely used systems for language understanding and generation, while multi-armed bandit (MAB) algorithms provide a principled framework for adaptive decision-making under uncertainty. This survey explores the potential at the intersection of these two fields. To our knowledge, it is the first survey to systematically review the bidirectional interaction between large language models and multi-armed bandits at the component level. We highlight the bidirectional benefits: MAB algorithms address critical LLM challenges, spanning from pre-training to retrieval-augmented generation (RAG) and personalization. Conversely, LLMs enhance MAB systems by redefining core components such as arm definition and environment modeling, thereby improving decision-making in sequential tasks. We analyze existing LLM-enhanced bandit systems and bandit-enhanced LLM systems, providing insights into their design, methodologies, and performance. Key challenges and representative findings are identified to help guide future research. An accompanying GitHub repository that indexes relevant literature is available at https://github.com/bucky1119/Awesome-LLM-Bandit-Interaction.
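To ground the bandit side of the abstract, the sketch below shows a minimal UCB1 bandit where the arms could be LLM-proposed candidates (e.g. prompt templates), standing in for the "arm definition" component the survey discusses. The arm names and reward function are illustrative assumptions, not taken from the paper.

```python
import math
import random

def ucb1(reward_fn, arms, horizon=1000):
    """Minimal UCB1: pick the arm maximizing empirical mean + exploration bonus."""
    counts = {a: 0 for a in arms}
    sums = {a: 0.0 for a in arms}
    for t in range(1, horizon + 1):
        # Play each arm once before applying the UCB rule.
        untried = [a for a in arms if counts[a] == 0]
        if untried:
            arm = untried[0]
        else:
            arm = max(arms, key=lambda a: sums[a] / counts[a]
                      + math.sqrt(2 * math.log(t) / counts[a]))
        r = reward_fn(arm)
        counts[arm] += 1
        sums[arm] += r
    return max(arms, key=lambda a: sums[a] / counts[a])

# Hypothetical LLM-proposed arms (e.g. candidate prompt templates).
arms = ["template_a", "template_b", "template_c"]
random.seed(0)
# Stochastic reward: template_b has the highest expected payoff.
best = ucb1(lambda a: random.random() * (0.8 if a == "template_b" else 0.5), arms)
```

In an "LLM-enhanced MAB" setup in the survey's sense, the fixed `arms` list would instead be generated or refined by an LLM, while the bandit loop above handles the exploration-exploitation trade-off unchanged.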
Problem

Research questions and friction points this paper is trying to address.

Large Language Models
Multi-Armed Bandits
Component-Level Interaction
Bidirectional Integration
Adaptive Decision-Making
Innovation

Methods, ideas, or system contributions that make the work stand out.

Large Language Models
Multi-Armed Bandits
Component-Level Interaction
Retrieval-Augmented Generation
Sequential Decision-Making
Siguang Chen
College of Information and Electrical Engineering, China Agricultural University, Beijing 100083, China
Chunli Lv
College of Information and Electrical Engineering, China Agricultural University, Beijing 100083, China; Key Laboratory of Agricultural Machinery Monitoring and Big Data Application, Ministry of Agriculture and Rural Affairs, Beijing 100083, China
Miao Xie
Alibaba Group
Recommender Systems
Social Network Analysis
Software Engineering