Gradient Deconfliction via Orthogonal Projections onto Subspaces For Multi-task Learning

📅 2025-03-05
🏛️ Web Search and Data Mining
📈 Citations: 1
✨ Influential: 0
📄 PDF
🤖 AI Summary
In multi-task learning (MTL), gradient conflicts among tasks often degrade performance relative to single-task models. To address this, we propose GradOPS, a method that orthogonally projects each task's gradient onto the subspace spanned by the gradients of all other tasks, thereby systematically eliminating gradient conflicts and enabling non-conflicting, controllable task trade-offs. We provide the first theoretical analysis establishing that gradient non-conflictness is both necessary and sufficient for achieving Pareto-optimal weighting strategies. GradOPS jointly achieves global conflict suppression and diverse Pareto-optimal solution discovery, with provable convergence guarantees. Extensive experiments across heterogeneous benchmarks demonstrate that GradOPS consistently outperforms state-of-the-art MTL methods, efficiently generating multiple high-quality Pareto-optimal solutions and supporting flexible, preference-aware task customization.
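For reference, the orthogonal projection named in the summary is the standard linear-algebra operation; the notation below is mine for illustration, not taken from the paper:

```latex
% Projection of task i's gradient g_i onto the subspace S_i spanned by
% the other tasks' gradients; B_i stacks those gradients as columns,
% and (.)^+ is the Moore-Penrose pseudoinverse (handles rank deficiency).
\[
  \operatorname{proj}_{S_i}(g_i) = B_i \bigl(B_i^{\top} B_i\bigr)^{+} B_i^{\top} g_i,
  \qquad
  B_i = \bigl[\, g_1 \ \cdots \ g_{i-1} \ \ g_{i+1} \ \cdots \ g_T \,\bigr].
\]
```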

📝 Abstract
Although multi-task learning (MTL) is a preferred approach that has been successfully applied in many real-world scenarios, MTL models are not guaranteed to outperform single-task models on all tasks, mainly due to the negative effects of conflicting gradients among the tasks. In this paper, we thoroughly examine the influence of conflicting gradients and highlight the importance and advantages of achieving non-conflicting gradients, which enable simple yet effective trade-off strategies among the tasks with stable performance. Based on our findings, we propose Gradient Deconfliction via Orthogonal Projections onto Subspaces (GradOPS), which projects each task-specific gradient onto the subspace spanned by the other tasks' gradients. Our method not only resolves all conflicts among the tasks but can also effectively search for diverse solutions reflecting different trade-off preferences among the tasks. A theoretical convergence analysis is provided, and the algorithm is thoroughly evaluated on multiple benchmarks across various domains. Results demonstrate that our method effectively finds multiple state-of-the-art solutions with different trade-off strategies among the tasks on multiple datasets.
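As a concrete illustration of the projection-based deconfliction described above, here is a minimal NumPy sketch. It is an assumed reading of the abstract, not the paper's actual algorithm: `project_onto_subspace` and `gradops_step` are hypothetical names, and the rule of dropping the in-subspace component only when it conflicts with the other tasks' average gradient is an assumption in the spirit of projection-based methods such as PCGrad.

```python
import numpy as np

def project_onto_subspace(g, basis):
    """Orthogonal projection of vector g onto the span of the rows of `basis`."""
    B = np.atleast_2d(basis)
    # Least-squares coefficients of g in the (possibly rank-deficient) basis.
    coef, *_ = np.linalg.lstsq(B.T, g, rcond=None)
    return B.T @ coef

def gradops_step(grads):
    """Hypothetical deconfliction sketch (names and conflict rule are assumptions).

    For each task gradient, split it into the component lying in the
    subspace spanned by the other tasks' gradients plus an orthogonal
    remainder; if the in-subspace component conflicts (negative inner
    product) with the other tasks' average gradient, keep only the
    orthogonal remainder. Returns the mean of the adjusted gradients.
    """
    grads = np.asarray(grads, dtype=float)
    adjusted = []
    for i, g in enumerate(grads):
        others = np.delete(grads, i, axis=0)
        g_par = project_onto_subspace(g, others)   # in-subspace part
        g_perp = g - g_par                         # orthogonal remainder
        if g_par @ others.mean(axis=0) < 0:        # conflicting component?
            g = g_perp
        adjusted.append(g)
    return np.mean(adjusted, axis=0)               # combined update direction
```

With two orthogonal gradients nothing is projected away and the step is simply their mean; with opposing gradients, the conflicting components are removed before averaging.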
Problem

Research questions and friction points this paper is trying to address.

Addresses conflicting gradients in multi-task learning models.
Proposes Gradient Deconfliction via Orthogonal Projections (GradOPS).
Enhances performance and trade-off strategies across tasks.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Orthogonal projections for gradient deconfliction
Subspaces spanned by task-specific gradients
Multiple state-of-the-art trade-off solutions
👥 Authors
Shijie Zhu (Alibaba Group, Beijing, China)
Hui Zhao (Alibaba Group, Beijing, China)
Tianshu Wu (Alibaba Group, Beijing, China)
Pengjie Wang (Alibaba Group, Beijing, China)
Hongbo Deng (Google)
Jian Xu (Alibaba Group, Beijing, China)
Bo Zheng (Alibaba Group, Beijing, China)