More is not always better? Enhancing Many-Shot In-Context Learning with Differentiated and Reweighting Objectives

📅 2025-01-07
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This work addresses the performance degradation of large language models (LLMs) in many-shot in-context learning (ICL) as the number of demonstrations increases. We first identify the suboptimality of the negative log-likelihood (NLL) objective and incremental noise accumulation as primary causes. To mitigate these issues, we propose DR-ICL, a novel method integrating global differentiated learning with a local advantage-driven dynamic reweighting mechanism. We further introduce MICLB, the first large-scale, multi-task many-shot ICL benchmark covering 1–350 shots and up to 8K context tokens, enabling systematic long-context synthesis and evaluation. Extensive experiments across seven NLP task categories and 50 datasets demonstrate consistent, significant improvements over state-of-the-art baselines, enhancing both in-domain and out-of-domain generalization. DR-ICL ensures that many-shot performance surpasses zero-shot levels, achieving stable gains even at 350-shot settings.
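The summary names the two ingredients but not their form. As a hedged sketch only (the notation $\ell_k$, $\bar{\ell}_{<k}$, $A_k$, and $\lambda$ is ours, not the paper's), a differentiated, advantage-reweighted NLL objective could look like

$$
\mathcal{L}_{\text{DR-ICL}}
= \sum_{k=1}^{K} w_k\,\ell_k
+ \lambda\,\max\bigl(0,\ \ell_K - \ell_0\bigr),
\qquad
w_k = \frac{\exp(A_k)}{\sum_{j=1}^{K}\exp(A_j)},
\quad
A_k = \bar{\ell}_{<k} - \ell_k,
$$

where $\ell_k$ is the NLL of the target given $k$ demonstrations and $\ell_0$ is the zero-shot NLL. The hinge term encodes the global requirement that many-shot loss not exceed the zero-shot level (differentiated learning), while the cumulative advantage $A_k$ up-weights demonstrations that improve on the running average $\bar{\ell}_{<k}$ (reweighting).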

📝 Abstract
Large language models (LLMs) excel at few-shot in-context learning (ICL) without requiring parameter updates. However, as the number of ICL demonstrations increases from a few to many, performance tends to plateau and eventually decline. We identify two primary causes for this trend: the suboptimal negative log-likelihood (NLL) optimization objective and the incremental data noise. To address these issues, we introduce DR-ICL, a novel optimization method that enhances model performance through Differentiated Learning and advantage-based Reweighting objectives. Globally, DR-ICL utilizes differentiated learning to optimize the NLL objective, ensuring that many-shot performance surpasses zero-shot levels. Locally, it dynamically adjusts the weighting of many-shot demonstrations by leveraging cumulative advantages inspired by reinforcement learning, thereby improving generalization. This approach allows the model to handle varying numbers of shots effectively, mitigating the impact of noisy data. Recognizing the lack of multi-task datasets with diverse many-shot distributions, we develop the Many-Shot ICL Benchmark (MICLB), a large-scale benchmark covering shot numbers from 1 to 350 within sequences of up to 8,000 tokens, for fine-tuning purposes. MICLB facilitates the evaluation of many-shot ICL strategies across seven prominent NLP tasks and 50 distinct datasets. Experimental results demonstrate that LLMs enhanced with DR-ICL achieve significant improvements in many-shot setups across various tasks, including both in-domain and out-of-domain scenarios. We release the code and benchmark dataset in the hope of facilitating further research in many-shot ICL.
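The abstract describes the mechanism only at a high level. The following is a minimal sketch, assuming a per-demonstration NLL vector is already available, of how such a differentiated, advantage-reweighted loss might be computed; the function name `dr_icl_loss`, the cumulative-average baseline, and the softmax weighting are illustrative choices, not the released implementation.

```python
import torch

def dr_icl_loss(per_shot_nll: torch.Tensor, zero_shot_nll: torch.Tensor,
                margin_weight: float = 1.0) -> torch.Tensor:
    """Illustrative differentiated + advantage-reweighted NLL (not the paper's code).

    per_shot_nll:  shape [K], NLL of the query answer after seeing 1..K demonstrations.
    zero_shot_nll: scalar, NLL of the same answer with no demonstrations.
    """
    k = per_shot_nll.numel()

    # Global differentiated term: penalize any shot count whose NLL is worse than
    # the zero-shot reference, so many-shot performance should not fall below zero-shot.
    differentiated = torch.relu(per_shot_nll - zero_shot_nll).mean()

    # Local advantage-based reweighting: compare each shot's NLL to the cumulative
    # average up to that shot and up-weight the demonstrations that reduce the loss.
    counts = torch.arange(1, k + 1, dtype=per_shot_nll.dtype)
    running_avg = torch.cumsum(per_shot_nll, dim=0) / counts
    advantage = running_avg - per_shot_nll          # > 0 when shot k improves the loss
    weights = torch.softmax(advantage, dim=0)       # normalized demonstration weights

    reweighted = (weights * per_shot_nll).sum()
    return reweighted + margin_weight * differentiated
```

For example, `dr_icl_loss(torch.tensor([2.1, 1.8, 1.9]), torch.tensor(2.3))` returns a scalar that could be backpropagated during many-shot fine-tuning under these assumptions.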
Problem

Research questions and friction points this paper is trying to address.

Large Language Models
Optimization Objectives
Data Efficiency
Innovation

Methods, ideas, or system contributions that make the work stand out.

DR-ICL
Many-Shot ICL Benchmark
Adaptive Learning
Xiaoqing Zhang
Gaoling School of Artificial Intelligence, Renmin University of China; Moonshot AI
Ang Lv
Renmin University of China
Language Model
Yuhan Liu
Gaoling School of Artificial Intelligence, Renmin University of China
Flood Sung
Moonshot AI
Foundation Models
LLM/VLM
Agent
Reinforcement Learning
Meta Learning
Wei Liu
Xiaomi AI Lab
Shuo Shang
Computer Science & AI Scientist
Spatial data
Spatiotemporal databases
Xiuying Chen
MBZUAI
Trustworthy NLP
Human-Centered NLP
Computational Social Science
Rui Yan
Gaoling School of Artificial Intelligence, Renmin University of China