🤖 AI Summary
In RAG scenarios, supervised fine-tuning often induces catastrophic forgetting, degrading a model's general-purpose capabilities. We identify that RAG fine-tuning significantly shifts the model's semantic distribution, and we show that this shift correlates strongly with the severity of forgetting. To address this, we propose SelfAug: a self-distribution alignment method that requires no external general-purpose instruction data. SelfAug explicitly preserves the original semantic structure during fine-tuning by enforcing distributional consistency of the input-sequence logits via a lightweight, plug-and-play self-alignment loss. Its core contributions are the first mechanistic characterization of distribution shift in RAG fine-tuning and a novel, efficient loss design for implicit semantic regularization. Experiments across multiple downstream RAG tasks show that SelfAug substantially outperforms existing anti-forgetting methods, simultaneously improving task-specific performance and mitigating forgetting, thereby enhancing model generalization.
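To make the idea concrete, the self-alignment loss described above can be sketched as a divergence between the fine-tuned model's and a frozen reference copy's next-token distributions over the *input* (context) positions. This is an illustrative sketch, not the paper's exact implementation: the function name, the choice of KL divergence, and the masking scheme are assumptions for exposition.

```python
import torch
import torch.nn.functional as F

def self_alignment_loss(student_logits: torch.Tensor,
                        reference_logits: torch.Tensor,
                        input_mask: torch.Tensor) -> torch.Tensor:
    """Hypothetical sketch of a self-distribution alignment loss.

    Penalizes divergence between the fine-tuned model's and the frozen
    pre-fine-tuning reference model's token distributions, averaged only
    over input-sequence positions, so the task loss is free to shape the
    output positions while the input-side semantic distribution is kept.

    student_logits / reference_logits: (batch, seq_len, vocab)
    input_mask: (batch, seq_len), 1.0 on input/context tokens, 0.0 elsewhere
    """
    # Log-probabilities of the model being fine-tuned.
    log_p = F.log_softmax(student_logits, dim=-1)
    # Probabilities of the frozen reference model (no gradients needed).
    q = F.softmax(reference_logits, dim=-1)
    # Token-level KL(q || p), summed over the vocabulary dimension.
    kl = (q * (q.clamp_min(1e-9).log() - log_p)).sum(dim=-1)
    # Average only over input-sequence positions.
    mask = input_mask.float()
    return (kl * mask).sum() / mask.sum().clamp_min(1.0)
```

In training, this term would typically be added to the task loss with a weighting coefficient (e.g. `loss = task_loss + lam * self_alignment_loss(...)`), with the reference logits computed once per batch by the frozen original model under `torch.no_grad()`.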
📝 Abstract
Recent advancements in large language models (LLMs) have revolutionized natural language processing through their remarkable capabilities in understanding and executing diverse tasks. While supervised fine-tuning, particularly in Retrieval-Augmented Generation (RAG) scenarios, effectively enhances task-specific performance, it often leads to catastrophic forgetting, where models lose their previously acquired knowledge and general capabilities. Existing solutions either require access to general instruction data or face limitations in preserving the model's original distribution. To overcome these limitations, we propose SelfAug, a self-distribution alignment method that aligns input sequence logits to preserve the model's semantic distribution, thereby mitigating catastrophic forgetting and improving downstream performance. Extensive experiments demonstrate that SelfAug achieves a superior balance between downstream learning and general capability retention. Our comprehensive empirical analysis reveals a direct correlation between distribution shifts and the severity of catastrophic forgetting in RAG scenarios, highlighting how the absence of RAG capabilities in general instruction tuning leads to significant distribution shifts during fine-tuning. Our findings not only advance the understanding of catastrophic forgetting in RAG contexts but also provide a practical solution applicable across diverse fine-tuning scenarios. Our code is publicly available at https://github.com/USTC-StarTeam/SelfAug.