Exploring Information Processing in Large Language Models: Insights from Information Bottleneck Theory

📅 2025-01-02
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the challenge of efficient inference and adaptation for large language models (LLMs), grounded in Information Bottleneck Theory. The authors propose a task-semantic space modeling paradigm with two parts: first, a non-parametric, training-free method that constructs task-specific semantic subspaces (such as sentiment and topic) by compressing inputs into interpretable, low-dimensional representations; second, two principled techniques built on these subspaces, Information Compression-based Context Learning (IC-ICL) and Task-Space-guided Fine-Tuning (TS-FT). IC-ICL improves reasoning accuracy while cutting inference time by over 40%, without introducing additional parameters. TS-FT achieves superior results with a minimal strategy adjustment. Evaluation across multiple benchmarks demonstrates the generalizability and effectiveness of the learned task spaces, offering a principled pathway toward interpretable, parameter-efficient adaptation of LLMs.
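
The summary describes the task-space construction only at a high level. The sketch below illustrates one plausible training-free reading of it: take frozen hidden states from an LLM, find a low-dimensional subspace via SVD, and record per-class anchors in that subspace. Everything here (the PCA-style recipe, `build_task_subspace`, `compress`, and the synthetic data) is an illustrative assumption, not the paper's exact procedure.

```python
# Minimal sketch of a non-parametric, training-free task-space construction,
# assuming access to last-token hidden states from a frozen LLM. The SVD
# recipe and all names here are illustrative assumptions, not the paper's code.
import numpy as np

def build_task_subspace(hidden_states: np.ndarray, labels: np.ndarray, dim: int = 8):
    """Fit a low-dimensional task subspace from frozen representations.

    hidden_states: (n, d_model) last-token states for labeled examples.
    labels:        (n,) integer task labels (e.g., sentiment classes).
    dim:           target subspace dimensionality.
    """
    mu = hidden_states.mean(axis=0)
    # Top principal directions of the centered states span the task space.
    _, _, vt = np.linalg.svd(hidden_states - mu, full_matrices=False)
    basis = vt[:dim]  # (dim, d_model), rows are orthonormal

    # Per-class centroids, expressed in task-space coordinates, act as
    # interpretable anchors (e.g., "positive" vs. "negative" regions).
    classes = np.unique(labels)
    centroids = np.stack([hidden_states[labels == c].mean(axis=0) for c in classes])
    class_coords = (centroids - mu) @ basis.T  # (n_classes, dim)
    return mu, basis, class_coords

def compress(x: np.ndarray, mu: np.ndarray, basis: np.ndarray) -> np.ndarray:
    """The information-bottleneck step: d_model floats -> dim floats."""
    return (x - mu) @ basis.T

# Toy usage with synthetic states standing in for real LLM activations.
rng = np.random.default_rng(0)
h = rng.normal(size=(100, 768))   # pretend hidden states
y = rng.integers(0, 2, size=100)  # pretend sentiment labels
mu, basis, class_coords = build_task_subspace(h, y, dim=8)
z = compress(h[:1], mu, basis)    # (1, 8) compressed representation
```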

📝 Abstract
Large Language Models (LLMs) have demonstrated remarkable performance across a wide range of tasks by understanding input information and predicting corresponding outputs. However, the internal mechanisms by which LLMs comprehend input and make effective predictions remain poorly understood. In this paper, we explore the working mechanism of LLMs in information processing from the perspective of Information Bottleneck Theory. We propose a training-free construction strategy to define a task space and identify the following key findings: (1) LLMs compress input information into specific task spaces (e.g., sentiment space, topic space) to facilitate task understanding; (2) they then extract and utilize relevant information from the task space at critical moments to generate accurate predictions. Based on these insights, we introduce two novel approaches: Information Compression-based Context Learning (IC-ICL) and Task-Space-guided Fine-Tuning (TS-FT). IC-ICL enhances reasoning performance and inference efficiency by compressing retrieved example information into the task space. TS-FT employs a space-guided loss to fine-tune LLMs, encouraging the learning of more effective compression and selection mechanisms. Experiments across multiple datasets validate the effectiveness of the task-space construction. Additionally, IC-ICL not only improves performance but also accelerates inference by over 40%, while TS-FT achieves superior results with a minimal strategy adjustment.
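
The abstract names IC-ICL without spelling out the mechanism. The sketch below shows the general shape of the idea as described: retrieve demonstrations for a query, then hand the model compact task-space codes instead of full example text, which is what shrinks the effective context and speeds up inference. The cosine-similarity retrieval, the code format, and the reuse of `mu`/`basis` from the earlier sketch are assumptions, not the paper's exact design.

```python
# Hedged sketch of the IC-ICL idea: compress retrieved demonstrations into
# task-space codes rather than concatenating their full text into the prompt.
# Retrieval by cosine similarity is an assumption, not the paper's choice.
import numpy as np

def ic_icl_codes(query_vec, example_vecs, example_labels, mu, basis, k=4):
    """Return (k, dim) task-space codes and labels for the k nearest examples."""
    q = query_vec / np.linalg.norm(query_vec)
    e = example_vecs / np.linalg.norm(example_vecs, axis=1, keepdims=True)
    top = np.argsort(e @ q)[-k:]                # k most similar examples
    codes = (example_vecs[top] - mu) @ basis.T  # d_model floats -> dim floats each
    return codes, example_labels[top]
```

The space-guided loss behind TS-FT is likewise only named in the abstract. One natural instantiation, sketched below, is the ordinary task loss plus a penalty that keeps each example's compressed representation near its class's anchor in the frozen task space; the weighting `lam` and the squared-error form are assumptions, and the paper's exact loss may differ.

```python
# One plausible form of a space-guided fine-tuning loss: cross-entropy plus a
# term aligning compressed representations with frozen task-space anchors.
import torch
import torch.nn.functional as F

def space_guided_loss(logits, targets, hidden, mu, basis, class_coords, lam=0.1):
    """logits: (B, C); hidden: (B, d_model); mu/basis/class_coords are frozen."""
    task_loss = F.cross_entropy(logits, targets)
    z = (hidden - mu) @ basis.T                   # compress into the task space
    align = F.mse_loss(z, class_coords[targets])  # stay near the right anchor
    return task_loss + lam * align
```
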
Problem

Research questions and friction points this paper is trying to address.

Large Language Models
Information Processing
Efficiency and Speed
Innovation

Methods, ideas, or system contributions that make the work stand out.

Information Bottleneck Theory
IC-ICL
TS-FT
👥 Authors
Zhou Yang
College of Computer and Data Science, Fuzhou University, Fuzhou, China
Zhengyu Qi
Leiden University, Leiden, The Netherlands
Zhaochun Ren
Leiden University
Information retrieval · Natural language processing
Zhikai Jia
SCITIX (SGP) TECH PTE. LTD, Singapore
Haizhou Sun
SmartMore, Shenzhen, China
Xiaofei Zhu
Chongqing University of Technology
Computer Science
Xiangwen Liao
College of Computer and Data Science, Fuzhou University, Fuzhou, China