Exploring Information Processing in Large Language Models: Insights from Information Bottleneck Theory

📅 2025-01-02
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the challenge of efficient inference and adaptation for large language models (LLMs), grounded in Information Bottleneck Theory. The authors propose a task-semantic space modeling paradigm with two parts: first, a non-parametric, training-free method that constructs task-specific semantic subspaces (such as sentiment and topic) by compressing inputs into interpretable, low-dimensional representations; second, two principled techniques built on these subspaces, Information Compression-based Context Learning (IC-ICL) and Task-Space-guided Fine-Tuning (TS-FT). IC-ICL improves reasoning accuracy while cutting inference time by over 40%, without introducing additional parameters. TS-FT achieves superior results with a minimal strategy adjustment. Evaluation across multiple benchmarks demonstrates the generalizability and effectiveness of the learned task spaces, offering a principled pathway toward interpretable, parameter-efficient adaptation of LLMs.
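
The summary describes the task-space construction only at a high level. The sketch below illustrates one plausible training-free reading of it: take frozen hidden states from an LLM, find a low-dimensional subspace via SVD, and record per-class anchors in that subspace. Everything here (the PCA-style recipe, `build_task_subspace`, `compress`, and the synthetic data) is an illustrative assumption, not the paper's exact procedure.

```python
# Minimal sketch of a non-parametric, training-free task-space construction,
# assuming access to last-token hidden states from a frozen LLM. The SVD
# recipe and all names here are illustrative assumptions, not the paper's code.
import numpy as np

def build_task_subspace(hidden_states: np.ndarray, labels: np.ndarray, dim: int = 8):
    """Fit a low-dimensional task subspace from frozen representations.

    hidden_states: (n, d_model) last-token states for labeled examples.
    labels:        (n,) integer task labels (e.g., sentiment classes).
    dim:           target subspace dimensionality.
    """
    mu = hidden_states.mean(axis=0)
    # Top principal directions of the centered states span the task space.
    _, _, vt = np.linalg.svd(hidden_states - mu, full_matrices=False)
    basis = vt[:dim]  # (dim, d_model), rows are orthonormal

    # Per-class centroids, expressed in task-space coordinates, act as
    # interpretable anchors (e.g., "positive" vs. "negative" regions).
    classes = np.unique(labels)
    centroids = np.stack([hidden_states[labels == c].mean(axis=0) for c in classes])
    class_coords = (centroids - mu) @ basis.T  # (n_classes, dim)
    return mu, basis, class_coords

def compress(x: np.ndarray, mu: np.ndarray, basis: np.ndarray) -> np.ndarray:
    """The information-bottleneck step: d_model floats -> dim floats."""
    return (x - mu) @ basis.T

# Toy usage with synthetic states standing in for real LLM activations.
rng = np.random.default_rng(0)
h = rng.normal(size=(100, 768))   # pretend hidden states
y = rng.integers(0, 2, size=100)  # pretend sentiment labels
mu, basis, class_coords = build_task_subspace(h, y, dim=8)
z = compress(h[:1], mu, basis)    # (1, 8) compressed representation
```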

📝 Abstract
Large Language Models (LLMs) have demonstrated remarkable performance across a wide range of tasks by understanding input information and predicting corresponding outputs. However, the internal mechanisms by which LLMs comprehend input and make effective predictions remain poorly understood. In this paper, we explore the working mechanism of LLMs in information processing from the perspective of Information Bottleneck Theory. We propose a training-free construction strategy to define a task space and identify the following key findings: (1) LLMs compress input information into specific task spaces (e.g., sentiment space, topic space) to facilitate task understanding; (2) they then extract and utilize relevant information from the task space at critical moments to generate accurate predictions. Based on these insights, we introduce two novel approaches: Information Compression-based Context Learning (IC-ICL) and Task-Space-guided Fine-Tuning (TS-FT). IC-ICL enhances reasoning performance and inference efficiency by compressing retrieved example information into the task space. TS-FT employs a space-guided loss to fine-tune LLMs, encouraging the learning of more effective compression and selection mechanisms. Experiments across multiple datasets validate the effectiveness of the task-space construction. Additionally, IC-ICL not only improves performance but also accelerates inference by over 40%, while TS-FT achieves superior results with a minimal strategy adjustment.
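
The abstract names IC-ICL without spelling out the mechanism. The sketch below shows the general shape of the idea as described: retrieve demonstrations for a query, then hand the model compact task-space codes instead of full example text, which is what shrinks the effective context and speeds up inference. The cosine-similarity retrieval, the code format, and the reuse of `mu`/`basis` from the earlier sketch are assumptions, not the paper's exact design.

```python
# Hedged sketch of the IC-ICL idea: compress retrieved demonstrations into
# task-space codes rather than concatenating their full text into the prompt.
# Retrieval by cosine similarity is an assumption, not the paper's choice.
import numpy as np

def ic_icl_codes(query_vec, example_vecs, example_labels, mu, basis, k=4):
    """Return (k, dim) task-space codes and labels for the k nearest examples."""
    q = query_vec / np.linalg.norm(query_vec)
    e = example_vecs / np.linalg.norm(example_vecs, axis=1, keepdims=True)
    top = np.argsort(e @ q)[-k:]                # k most similar examples
    codes = (example_vecs[top] - mu) @ basis.T  # d_model floats -> dim floats each
    return codes, example_labels[top]
```

The space-guided loss behind TS-FT is likewise only named in the abstract. One natural instantiation, sketched below, is the ordinary task loss plus a penalty that keeps each example's compressed representation near its class's anchor in the frozen task space; the weighting `lam` and the squared-error form are assumptions, and the paper's exact loss may differ.

```python
# One plausible form of a space-guided fine-tuning loss: cross-entropy plus a
# term aligning compressed representations with frozen task-space anchors.
import torch
import torch.nn.functional as F

def space_guided_loss(logits, targets, hidden, mu, basis, class_coords, lam=0.1):
    """logits: (B, C); hidden: (B, d_model); mu/basis/class_coords are frozen."""
    task_loss = F.cross_entropy(logits, targets)
    z = (hidden - mu) @ basis.T                   # compress into the task space
    align = F.mse_loss(z, class_coords[targets])  # stay near the right anchor
    return task_loss + lam * align
```
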
Problem

Research questions and friction points this paper is trying to address.

Large Language Models
Information Processing
Efficiency and Speed
Innovation

Methods, ideas, or system contributions that make the work stand out.

Information Bottleneck Theory
IC-ICL
TS-FT
👥 Authors
Zhou Yang
College of Computer and Data Science, Fuzhou University, Fuzhou, China
Zhengyu Qi
Leiden University, Leiden, The Netherlands
Zhaochun Ren
Leiden University
Information retrieval · Natural language processing
Zhikai Jia
SCITIX (SGP) TECH PTE. LTD, Singapore
Haizhou Sun
SmartMore, Shenzhen, China
Xiaofei Zhu
Chongqing University of Technology
Computer Science
Xiangwen Liao
College of Computer and Data Science, Fuzhou University, Fuzhou, China