🤖 AI Summary
This study systematically investigates how three types of noise (label errors, grammatical errors, and spelling errors) affect model performance and internal mechanisms during fine-tuning of large language models. Focusing on GPT-2, Qwen2, and Llama-2 across three diverse NLP tasks, the authors combine controlled noise injection, layer-wise representation analysis, attention analysis, and cross-model comparison to show how noise propagates through the network and where its effects concentrate. The findings indicate that label noise consistently causes the largest performance degradation, whereas grammatical and spelling noise occasionally exhibit mild regularization effects. Moreover, the influence of noise is primarily confined to task-specific layers, while the attention structure of the models remains comparatively stable.
📝 Abstract
Fine-tuning is the dominant paradigm for adapting pretrained large language models (LLMs) to downstream NLP tasks. In practice, fine-tuning datasets may contain various forms of noise arising from annotation errors, preprocessing artifacts, or automated data collection. While prior work has focused on designing robust learning algorithms to mitigate performance degradation under noisy conditions, comparatively little is known about how different types of noise affect the internal learning dynamics of LLMs during fine-tuning. In this work, we systematically study the impact of noise on model behavior across three pretrained model families (GPT-2, Qwen2, and Llama-2) and three diverse NLP tasks. We introduce controlled perturbations corresponding to three common real-world noise types: label noise, grammatical noise, and typographical noise. Beyond task-level performance, we analyze layer-wise representation changes and attention patterns to understand how noise propagates through the network. Our results show that label noise consistently causes the largest performance degradation, whereas grammatical noise and typographical noise can occasionally yield mild regularization benefits. We further find that noise effects are localized primarily to task-specific layers, while attention structures remain comparatively stable.
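
As an illustration of what "controlled perturbations" could look like in practice, the following is a minimal sketch of two of the three noise types, label noise and typographical noise. It is not the authors' implementation: the function names `inject_label_noise` and `inject_typos`, the `rate` and `seed` parameters, the uniform label-flipping scheme, and the adjacent-character swap are all assumptions made for illustration. Grammatical noise is omitted because the abstract does not specify how it is generated.

```python
import random


def inject_label_noise(labels, num_classes, rate, seed=0):
    """Flip a fraction `rate` of labels to a different, uniformly chosen class.

    Hypothetical helper: the abstract does not state the flipping scheme.
    """
    rng = random.Random(seed)
    noisy = list(labels)
    n_flip = round(rate * len(noisy))
    # Sample distinct positions so exactly n_flip labels are changed.
    for i in rng.sample(range(len(noisy)), n_flip):
        alternatives = [c for c in range(num_classes) if c != noisy[i]]
        noisy[i] = rng.choice(alternatives)
    return noisy


def inject_typos(text, rate, seed=0):
    """Swap two adjacent inner letters in roughly a fraction `rate` of words.

    Hypothetical helper: only purely alphabetic words of length >= 4 are
    eligible, and the first and last characters are always left in place.
    A swap of two identical letters leaves the word visibly unchanged.
    """
    rng = random.Random(seed)
    out = []
    for word in text.split(" "):
        if len(word) >= 4 and word.isalpha() and rng.random() < rate:
            j = rng.randrange(1, len(word) - 2)
            word = word[:j] + word[j + 1] + word[j] + word[j + 2:]
        out.append(word)
    return " ".join(out)


if __name__ == "__main__":
    labels = [0, 1, 2] * 10
    print(inject_label_noise(labels, num_classes=3, rate=0.2))
    print(inject_typos("fine-tuning datasets may contain various forms of noise", rate=0.5))
```

Fixing the seed keeps the corrupted dataset reproducible across model families, so differences in downstream behavior can be attributed to the model rather than to a different noise sample.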