Rethinking Invariance in In-context Learning

📅 2025-05-08

📈 Citations: 1

✨ Influential: 0

🤖 AI Summary

In-context learning (ICL) in autoregressive large language models is sensitive to the ordering of demonstration examples—lacking permutation invariance—which undermines robustness and practical applicability. Existing invariant ICL methods improve order robustness but often degrade performance and struggle to simultaneously satisfy two critical constraints: information non-leakage and contextual interdependence. Method: We propose Invariant ICL (InvICL), the first approach to achieve truly permutation-invariant ICL via an autoregressive reconstruction framework that jointly incorporates a context-aware masking mechanism and a symmetric aggregation structure, enabling explicit modeling of inter-example dependencies while enforcing strict information isolation. Contribution/Results: Across multiple benchmarks, InvICL consistently outperforms standard ICL and all prior invariant ICL methods, demonstrating significantly enhanced generalization—particularly under variable-length input settings, where it maintains robust performance.

Technology Category

Application Category

📝 Abstract

In-Context Learning (ICL) has emerged as a pivotal capability of auto-regressive large language models, yet it is hindered by a notable sensitivity to the ordering of context examples regardless of their mutual independence. To address this issue, recent studies have introduced several variant algorithms of ICL that achieve permutation invariance. However, many of these do not exhibit comparable performance with the standard auto-regressive ICL algorithm. In this work, we identify two crucial elements in the design of an invariant ICL algorithm: information non-leakage and context interdependence, which are not simultaneously achieved by any of the existing methods. These investigations lead us to the proposed Invariant ICL (InvICL), a methodology designed to achieve invariance in ICL while ensuring the two properties. Empirically, our findings reveal that InvICL surpasses previous models, both invariant and non-invariant, in most benchmark datasets, showcasing superior generalization capabilities across varying input lengths. Code is available at https://github.com/PKU-ML/InvICL.

Problem

Research questions and friction points this paper is trying to address.

ICL suffers from sensitivity to example ordering

Existing invariant ICL methods lack performance parity

Current methods fail to achieve non-leakage and interdependence

Innovation

Methods, ideas, or system contributions that make the work stand out.

Introduces InvICL for permutation invariance

Ensures information non-leakage and context interdependence

Outperforms previous models in benchmark datasets

🔎 Similar Papers

No similar papers found.

Authors to Follow