Label Smoothing Improves Gradient Ascent in LLM Unlearning

📅 2025-10-25
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the training instability and severe utility degradation that gradient ascent (GA) causes in large language model (LLM) unlearning, this paper proposes Smoothed Gradient Ascent (SGA). SGA applies label smoothing to the forget samples, theoretically derives the optimal smoothing coefficient, and jointly optimizes over multiple synthetically constructed normal samples to regularize gradient directions and stabilize optimization. Crucially, SGA requires no architectural modifications or additional parameters, and it significantly mitigates GA-induced oscillations and over-unlearning. Evaluated on three benchmarks—TOFU, Harry Potter, and MUSE-NEWS—SGA consistently outperforms vanilla GA on the key metrics of forgetting completeness, retained utility, and training stability, achieving state-of-the-art (SOTA) or second-best performance on several of them. To the authors' knowledge, SGA is the first method to enable efficient, stable, and controllable selective unlearning in large language models.

📝 Abstract
LLM unlearning has emerged as a promising approach, aiming to enable models to forget hazardous or undesired knowledge at low cost while preserving as much model utility as possible. Among existing techniques, the most straightforward method is performing Gradient Ascent (GA) w.r.t. the forget data, thereby forcing the model to unlearn the forget dataset. However, GA suffers from severe instability, as it drives updates in a divergent direction, often resulting in drastically degraded model utility. To address this issue, we propose Smoothed Gradient Ascent (SGA). SGA combines the forget data with multiple constructed normal data through a tunable smoothing rate. Intuitively, this extends GA from learning solely on the forget data to jointly learning across both forget and normal data, enabling more stable unlearning while better preserving model utility. Theoretically, we provide guidance on selecting the optimal smoothing rate. Empirically, we evaluate SGA on three benchmarks: TOFU, Harry Potter, and MUSE-NEWS. Experimental results demonstrate that SGA consistently outperforms the original Gradient Ascent (GA) method across all metrics and achieves top-2 performance among all baseline methods on several key metrics.
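The abstract describes SGA as blending an ascent term on the forget data with a descent term on constructed normal data, weighted by a tunable smoothing rate. A minimal numpy sketch of such an objective is below; the exact loss form, the weighting scheme, and the names `sga_loss` and `alpha` are illustrative assumptions, not the paper's precise formulation:

```python
import numpy as np

def cross_entropy(probs, target):
    # cross-entropy between a predicted distribution and a target distribution
    return -float(np.sum(target * np.log(probs + 1e-12)))

def sga_loss(probs_forget, target_forget, probs_normal, targets_normal, alpha):
    """Sketch of a smoothed-GA-style objective to be *minimized*.

    alpha is the smoothing rate (an assumption about its role here):
    alpha = 0 recovers pure gradient ascent on the forget sample
    (the negated cross-entropy), while larger alpha mixes in ordinary
    descent on the constructed normal samples, regularizing the update.
    """
    # ascent on the forget sample: negate its loss so minimizing increases it
    forget_term = -(1.0 - alpha) * cross_entropy(probs_forget, target_forget)
    # descent on the normal samples: averaged ordinary cross-entropy
    normal_term = alpha * float(np.mean(
        [cross_entropy(p, y) for p, y in zip(probs_normal, targets_normal)]
    ))
    return forget_term + normal_term
```

With `alpha = 0` the objective is exactly the negated forget loss (vanilla GA); increasing `alpha` shifts weight toward fitting the normal data, which is the intuition the abstract gives for why SGA updates are more stable.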
Problem

Research questions and friction points this paper is trying to address.

Addresses instability in gradient ascent for LLM unlearning
Proposes smoothed gradient ascent to preserve model utility
Enables stable forgetting while maintaining normal knowledge
Innovation

Methods, ideas, or system contributions that make the work stand out.

Smoothed Gradient Ascent combines forget and normal data
Tunable smoothing rate enables stable unlearning while preserving utility
Method extends gradient ascent to joint learning across datasets
Zirui Pang
University of Illinois Urbana-Champaign
Machine Learning · Unlearning · Label Noise
Hao Zheng
Harbin Institute of Technology (Weihai)
Zhijie Deng
The Hong Kong University of Science and Technology (Guangzhou)
Ling Li
The Hong Kong University of Science and Technology (Guangzhou)
Zixin Zhong
The Hong Kong University of Science and Technology (Guangzhou)
Jiaheng Wei
The Hong Kong University of Science and Technology (Guangzhou)