Surprise Calibration for Better In-Context Learning

📅 2025-06-15

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Problem: Large language models (LLMs) suffer performance degradation in in-context learning (ICL) because their inherent priors are misaligned with the empirical class distribution of the in-context examples. Method: We propose a dynamic bias calibration framework grounded in implicit sequential Bayesian inference. Crucially, we introduce *surprise*, defined as the discrepancy between model output entropy and predictive confidence, as a lightweight, query-level signal for detecting class prior drift. This enables adaptive, context-aware estimation of class priors without explicit parametric modeling of the prior distribution. Contribution/Results: Our approach improves on state-of-the-art fixed-prior calibration methods across multiple NLP benchmarks. It makes ICL more robust to distributional shifts and improves cross-task generalization, while incurring negligible computational overhead. To our knowledge, this is the first work to use surprise as an implicit indicator of prior drift for dynamic, entropy- and confidence-based class bias estimation in ICL.
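The summary builds its surprise signal from two quantities: output entropy and predictive confidence. The sketch below only shows how those two quantities can be computed from a class-probability vector. It does not reproduce how the paper combines them into a surprise score, since that formula is not given here. The function names and the choice of "top class probability" as the confidence measure are assumptions made for illustration.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a class-probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def confidence(probs):
    """Predictive confidence, taken here (an assumption) as the top class probability."""
    return max(probs)


# Two hypothetical model outputs over four classes.
uniform = [0.25, 0.25, 0.25, 0.25]  # maximally uncertain
peaked = [0.97, 0.01, 0.01, 0.01]   # highly confident

for name, probs in (("uniform", uniform), ("peaked", peaked)):
    print(name, round(entropy(probs), 4), confidence(probs))
```

A uniform output has the highest entropy and the lowest confidence, while a peaked output has the opposite profile, so the two quantities carry related but distinct information about a single query.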

📝 Abstract

In-context learning (ICL) has emerged as a powerful paradigm for task adaptation in large language models (LLMs), where models infer underlying task structures from a few demonstrations. However, ICL remains susceptible to biases that arise from prior knowledge and contextual demonstrations, which can degrade the performance of LLMs. Existing bias calibration methods typically apply fixed class priors across all inputs, limiting their efficacy in dynamic ICL settings where the context for each query differs. To address these limitations, we adopt implicit sequential Bayesian inference as a framework for interpreting ICL, identify "surprise" as an informative signal for class prior shift, and introduce a novel method, Surprise Calibration (SC). SC leverages the notion of surprise to capture the temporal dynamics of class priors, providing a more adaptive and computationally efficient solution for in-context learning. We empirically demonstrate the superiority of SC over existing bias calibration techniques across a range of benchmark natural language processing tasks.
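For context, the fixed-prior calibration that the abstract contrasts SC against can be sketched roughly as follows: estimate one class prior, divide every query's class probabilities by it, and renormalize. This is a minimal illustration of that general idea, not the SC method and not any specific baseline's implementation. The function name `calibrate` and all numbers are hypothetical.

```python
def calibrate(probs, prior):
    """Divide each class probability by its estimated prior, then renormalize."""
    if len(probs) != len(prior):
        raise ValueError("probs and prior must have the same length")
    scaled = [p / q for p, q in zip(probs, prior)]
    total = sum(scaled)
    return [s / total for s in scaled]


# Hypothetical numbers: the model is assumed to favor class 0 regardless of input.
fixed_prior = [0.7, 0.3]   # one prior estimate, reused for every query
query_probs = [0.6, 0.4]   # raw class probabilities for a single query

calibrated = calibrate(query_probs, fixed_prior)
predicted = max(range(len(calibrated)), key=calibrated.__getitem__)
print(calibrated, predicted)
```

With these numbers the raw output favors class 0, but after dividing out the assumed prior the prediction flips to class 1. Because `fixed_prior` is shared across queries, it cannot reflect a context that changes from one query to the next, which is the limitation the abstract describes.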

Problem

Research questions and friction points this paper is trying to address.

Address biases in in-context learning that arise from prior knowledge and contextual demonstrations

Improve dynamic class prior adaptation in ICL

Enhance computational efficiency for bias calibration

Innovation

Methods, ideas, or system contributions that make the work stand out.

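Uses implicit sequential Bayesian inference as a framework for interpreting ICL

Identifies surprise as an informative signal for class prior shift

Introduces Surprise Calibration (SC) to capture the temporal dynamics of class priors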

🔎 Similar Papers

The Over-Certainty Phenomenon in Modern UDA Algorithms