Rethinking Invariance in In-context Learning

📅 2025-05-08
📈 Citations: 1
Influential: 0
📄 PDF
🤖 AI Summary
In-context learning (ICL) in autoregressive large language models is sensitive to the ordering of demonstration examples—lacking permutation invariance—which undermines robustness and practical applicability. Existing invariant ICL methods improve order robustness but often degrade performance and struggle to simultaneously satisfy two critical constraints: information non-leakage and contextual interdependence. Method: We propose Invariant ICL (InvICL), the first approach to achieve truly permutation-invariant ICL via an autoregressive reconstruction framework that jointly incorporates a context-aware masking mechanism and a symmetric aggregation structure, enabling explicit modeling of inter-example dependencies while enforcing strict information isolation. Contribution/Results: Across multiple benchmarks, InvICL consistently outperforms standard ICL and all prior invariant ICL methods, demonstrating significantly enhanced generalization—particularly under variable-length input settings, where it maintains robust performance.

Technology Category

Application Category

📝 Abstract
In-Context Learning (ICL) has emerged as a pivotal capability of auto-regressive large language models, yet it is hindered by a notable sensitivity to the ordering of context examples regardless of their mutual independence. To address this issue, recent studies have introduced several variant algorithms of ICL that achieve permutation invariance. However, many of these do not exhibit comparable performance with the standard auto-regressive ICL algorithm. In this work, we identify two crucial elements in the design of an invariant ICL algorithm: information non-leakage and context interdependence, which are not simultaneously achieved by any of the existing methods. These investigations lead us to the proposed Invariant ICL (InvICL), a methodology designed to achieve invariance in ICL while ensuring the two properties. Empirically, our findings reveal that InvICL surpasses previous models, both invariant and non-invariant, in most benchmark datasets, showcasing superior generalization capabilities across varying input lengths. Code is available at https://github.com/PKU-ML/InvICL.
Problem

Research questions and friction points this paper is trying to address.

ICL suffers from sensitivity to example ordering
Existing invariant ICL methods lack performance parity
Current methods fail to achieve non-leakage and interdependence
Innovation

Methods, ideas, or system contributions that make the work stand out.

Introduces InvICL for permutation invariance
Ensures information non-leakage and context interdependence
Outperforms previous models in benchmark datasets
🔎 Similar Papers
No similar papers found.
L
Lizhe Fang
State Key Lab of General Artificial Intelligence, School of Intelligence Science and Technology, Peking University
Y
Yifei Wang
MIT CSAIL
Khashayar Gatmiry
Khashayar Gatmiry
Graduate student, Massachusetts Institute of Technology
Machine LearningSamplingOptimization
L
Lei Fang
School of Economics, Peking University
Yisen Wang
Yisen Wang
Assistant Professor, Peking University
Machine LearningSelf-Supervised LearningLarge Language ModelsSafety