Gradient Deconfliction via Orthogonal Projections onto Subspaces For Multi-task Learning

📅 2025-03-05
🏛️ Web Search and Data Mining
📈 Citations: 1
✨ Influential: 0
📄 PDF
🤖 AI Summary
In multi-task learning (MTL), gradient conflicts among tasks often degrade performance relative to single-task models. To address this, we propose GradOPS, a method that orthogonally projects each task's gradient onto the subspace spanned by the gradients of all other tasks, thereby systematically eliminating gradient conflicts and enabling non-conflicting, controllable task trade-offs. We provide the first theoretical analysis establishing that gradient non-conflictness is both necessary and sufficient for achieving Pareto-optimal weighting strategies. GradOPS jointly achieves global conflict suppression and diverse Pareto-optimal solution discovery, with provable convergence guarantees. Extensive experiments across heterogeneous benchmarks demonstrate that GradOPS consistently outperforms state-of-the-art MTL methods, efficiently generating multiple high-quality Pareto-optimal solutions and supporting flexible, preference-aware task customization.
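For reference, the orthogonal projection named in the summary is the standard linear-algebra operation; the notation below is mine for illustration, not taken from the paper:

```latex
% Projection of task i's gradient g_i onto the subspace S_i spanned by
% the other tasks' gradients; B_i stacks those gradients as columns,
% and (.)^+ is the Moore-Penrose pseudoinverse (handles rank deficiency).
\[
  \operatorname{proj}_{S_i}(g_i) = B_i \bigl(B_i^{\top} B_i\bigr)^{+} B_i^{\top} g_i,
  \qquad
  B_i = \bigl[\, g_1 \ \cdots \ g_{i-1} \ \ g_{i+1} \ \cdots \ g_T \,\bigr].
\]
```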

📝 Abstract
Although multi-task learning (MTL) is a preferred approach that has been successfully applied in many real-world scenarios, MTL models are not guaranteed to outperform single-task models on all tasks, mainly due to the negative effects of conflicting gradients among the tasks. In this paper, we thoroughly examine the influence of conflicting gradients and highlight the importance and advantages of achieving non-conflicting gradients, which enable simple yet effective trade-off strategies among the tasks with stable performance. Based on our findings, we propose Gradient Deconfliction via Orthogonal Projections onto Subspaces (GradOPS), which projects each task-specific gradient onto the subspace spanned by the other tasks' gradients. Our method not only resolves all conflicts among the tasks but can also effectively search for diverse solutions reflecting different trade-off preferences among the tasks. A theoretical convergence analysis is provided, and the algorithm is thoroughly evaluated on multiple benchmarks across various domains. Results demonstrate that our method effectively finds multiple state-of-the-art solutions with different trade-off strategies among the tasks on multiple datasets.
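As a concrete illustration of the projection-based deconfliction described above, here is a minimal NumPy sketch. It is an assumed reading of the abstract, not the paper's actual algorithm: `project_onto_subspace` and `gradops_step` are hypothetical names, and the rule of dropping the in-subspace component only when it conflicts with the other tasks' average gradient is an assumption in the spirit of projection-based methods such as PCGrad.

```python
import numpy as np

def project_onto_subspace(g, basis):
    """Orthogonal projection of vector g onto the span of the rows of `basis`."""
    B = np.atleast_2d(basis)
    # Least-squares coefficients of g in the (possibly rank-deficient) basis.
    coef, *_ = np.linalg.lstsq(B.T, g, rcond=None)
    return B.T @ coef

def gradops_step(grads):
    """Hypothetical deconfliction sketch (names and conflict rule are assumptions).

    For each task gradient, split it into the component lying in the
    subspace spanned by the other tasks' gradients plus an orthogonal
    remainder; if the in-subspace component conflicts (negative inner
    product) with the other tasks' average gradient, keep only the
    orthogonal remainder. Returns the mean of the adjusted gradients.
    """
    grads = np.asarray(grads, dtype=float)
    adjusted = []
    for i, g in enumerate(grads):
        others = np.delete(grads, i, axis=0)
        g_par = project_onto_subspace(g, others)   # in-subspace part
        g_perp = g - g_par                         # orthogonal remainder
        if g_par @ others.mean(axis=0) < 0:        # conflicting component?
            g = g_perp
        adjusted.append(g)
    return np.mean(adjusted, axis=0)               # combined update direction
```

With two orthogonal gradients nothing is projected away and the step is simply their mean; with opposing gradients, the conflicting components are removed before averaging.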
Problem

Research questions and friction points this paper is trying to address.

Addresses conflicting gradients in multi-task learning models.
Proposes Gradient Deconfliction via Orthogonal Projections (GradOPS).
Enhances performance and trade-off strategies across tasks.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Orthogonal projections for gradient deconfliction
Subspaces spanned by task-specific gradients
Multiple state-of-the-art trade-off solutions
👥 Authors
Shijie Zhu (Alibaba Group, Beijing, China)
Hui Zhao (Alibaba Group, Beijing, China)
Tianshu Wu (Alibaba Group, Beijing, China)
Pengjie Wang (Alibaba Group, Beijing, China)
Hongbo Deng (Google)
Jian Xu (Alibaba Group, Beijing, China)
Bo Zheng (Alibaba Group, Beijing, China)