🤖 AI Summary
Large language models (LLMs) suffer performance degradation in in-context learning (ICL) due to misalignment between their inherent priors and the empirical class distribution of in-context examples.
Method: We propose a dynamic bias calibration framework grounded in implicit sequential Bayesian inference. Crucially, we introduce *surprise*—defined as the discrepancy between model output entropy and predictive confidence—as a lightweight, query-level signal for detecting class prior drift. This enables adaptive, context-aware estimation of class priors without explicit parametric modeling of prior distributions.
Contribution/Results: Our approach achieves significant improvements over state-of-the-art fixed-prior calibration methods across multiple NLP benchmarks. It enhances ICL robustness to distributional shifts and improves cross-task generalization, all while incurring negligible computational overhead. To our knowledge, this is the first work to leverage surprise as an implicit indicator of prior drift for dynamic, entropy- and confidence-based class bias estimation in ICL.
📝 Abstract
In-context learning (ICL) has emerged as a powerful paradigm for task adaptation in large language models (LLMs), where models infer underlying task structures from a few demonstrations. However, ICL remains susceptible to biases that arise from prior knowledge and contextual demonstrations, which can degrade the performance of LLMs. Existing bias calibration methods typically apply fixed class priors across all inputs, limiting their efficacy in dynamic ICL settings where the context for each query differs. To address these limitations, we adopt implicit sequential Bayesian inference as a framework for interpreting ICL, identify "surprise" as an informative signal for class prior shift, and introduce a novel method, Surprise Calibration (SC). SC leverages the notion of surprise to capture the temporal dynamics of class priors, providing a more adaptive and computationally efficient solution for in-context learning. We empirically demonstrate the superiority of SC over existing bias calibration techniques across a range of benchmark natural language processing tasks.
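The contrast between a fixed class prior and a per-query prior can be made concrete with a toy sketch. The code below is an illustration only and is not the SC method from the paper: `calibrate` shows the generic idea of dividing predicted class probabilities by a prior estimate and renormalizing, while `surprise` (taken here as the standard surprisal, the negative log-probability of a demonstration's label) and the `update_log_prior` rule with its `step` parameter are hypothetical choices made purely for this example.

```python
import math


def calibrate(probs, prior):
    """Divide class probabilities by a class prior estimate and renormalize."""
    scores = [p / q for p, q in zip(probs, prior)]
    total = sum(scores)
    return [s / total for s in scores]


def surprise(probs, label):
    """Surprisal (in nats) of the observed label under the model's prediction.

    Assumption: this textbook definition stands in for the paper's signal.
    """
    return -math.log(probs[label])


def update_log_prior(log_prior, probs, label, step=0.5):
    """Hypothetical update rule (not from the paper).

    Shifts the log-prior toward the observed label in proportion to how
    surprising that label was, then renormalizes so the prior sums to 1.
    """
    new_log_prior = list(log_prior)
    new_log_prior[label] += step * surprise(probs, label)
    log_z = math.log(sum(math.exp(v) for v in new_log_prior))
    return [v - log_z for v in new_log_prior]


def prior_from_demonstrations(demos, num_classes, step=0.5):
    """Run the toy update over a sequence of (model_probs, label) pairs."""
    log_prior = [math.log(1.0 / num_classes)] * num_classes
    for probs, label in demos:
        log_prior = update_log_prior(log_prior, probs, label, step)
    return [math.exp(v) for v in log_prior]


if __name__ == "__main__":
    # Made-up model outputs for two demonstrations, both labeled class 1.
    demos = [([0.8, 0.2], 1), ([0.6, 0.4], 1)]
    query_probs = [0.3, 0.7]

    fixed_prior = [0.5, 0.5]
    dynamic_prior = prior_from_demonstrations(demos, num_classes=2)

    print("fixed prior:  ", calibrate(query_probs, fixed_prior))
    print("dynamic prior:", calibrate(query_probs, dynamic_prior))
```

With a fixed uniform prior the query prediction is left unchanged, whereas the context-dependent prior discounts the class that the demonstrations over-represent. Per the abstract, SC captures the temporal dynamics of class priors in its own way; the hand-written update here only conveys the general shape of a query-specific prior.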