🤖 AI Summary
In multi-task learning (MTL), outlier tasks—those exhibiting adversarial or uninformative relationships with the primary task—degrade overall model performance. To address this, we propose a robust gradient boosting framework that uniquely integrates task heterogeneity modeling with gradient-level optimization. Our method employs regularization-driven task clustering to automatically partition tasks without prior labels; within the shared representation space, it identifies and suppresses gradient interference from outlier tasks while enabling fine-grained, group-specific adaptation for semantically related tasks. The framework operates end-to-end, jointly performing outlier detection, knowledge transfer, and robust optimization. Evaluated on synthetic and multiple real-world MTL benchmarks, it reduces average prediction error by up to 32%, significantly improving cross-task generalization and training stability. Our approach establishes a novel, interpretable, and scalable paradigm for heterogeneous MTL.
📝 Abstract
Multi-task learning (MTL) improves generalization by exploiting information shared across related tasks, under the assumption that tasks share similarities that benefit one another. In addition, boosting algorithms have demonstrated exceptional performance across diverse learning problems, primarily due to their ability to focus on hard-to-learn instances and iteratively reduce residual errors, which makes them a promising approach for multi-task problems. However, real-world MTL scenarios often involve tasks that are not well aligned (known as outlier or adversarial tasks): these share no beneficial similarities with the others and can, in fact, deteriorate the performance of the overall model. To overcome this challenge, we propose Robust-Multi-Task Gradient Boosting (R-MTGB), a novel boosting framework that explicitly models and adapts to task heterogeneity during training. R-MTGB structures the learning process into three sequential blocks: (1) learning shared patterns, (2) partitioning tasks into outliers and non-outliers with regularized parameters, and (3) fine-tuning task-specific predictors. This architecture enables R-MTGB to automatically detect and penalize outlier tasks while promoting effective knowledge transfer among related tasks. These mechanisms are integrated seamlessly within gradient boosting, allowing robust handling of noisy or adversarial tasks without sacrificing accuracy. Extensive experiments on both synthetic benchmarks and real-world datasets demonstrate that our approach successfully isolates outliers, transfers knowledge, consistently reduces the prediction error of each individual task, and achieves overall performance gains across all tasks. These results highlight the robustness, adaptability, and reliable convergence of R-MTGB in challenging MTL environments.
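The three-block structure described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption for illustration only, not the paper's implementation: toy sinusoidal tasks with one sign-flipped (adversarial) task, depth-1 regression stumps as base learners, and a simple explained-variance heuristic standing in for the regularized outlier/non-outlier partition of block (2).

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup (assumption): 3 related tasks plus 1 adversarial task whose
# target is the negation of the shared signal.
def make_task(flip_sign=False, n=200):
    X = rng.uniform(-1, 1, size=(n, 1))
    y = np.sin(3 * X[:, 0]) + 0.1 * rng.normal(size=n)
    return X, (-y if flip_sign else y)

tasks = [make_task() for _ in range(3)] + [make_task(flip_sign=True)]

def fit_stump(X, y):
    """Depth-1 regression tree over a fixed threshold grid (base learner)."""
    best = None
    for thr in np.linspace(-1, 1, 21):
        left, right = y[X[:, 0] <= thr], y[X[:, 0] > thr]
        if len(left) == 0 or len(right) == 0:
            continue
        pred = np.where(X[:, 0] <= thr, left.mean(), right.mean())
        sse = ((y - pred) ** 2).sum()
        if best is None or sse < best[0]:
            best = (sse, thr, left.mean(), right.mean())
    _, thr, lv, rv = best
    return lambda X: np.where(X[:, 0] <= thr, lv, rv)

lr = 0.3
preds = [np.zeros(len(y)) for _, y in tasks]

# Block 1: shared boosting stages, each fit on the pooled residuals
# of all tasks.
for _ in range(20):
    Xp = np.vstack([X for X, _ in tasks])
    rp = np.concatenate([y - p for (_, y), p in zip(tasks, preds)])
    h = fit_stump(Xp, rp)
    for t, (X, _) in enumerate(tasks):
        preds[t] += lr * h(X)

# Block 2: partition tasks. A task the shared model barely helped
# (low explained variance) is flagged as an outlier. (Stand-in for the
# paper's regularized partition parameters.)
weights = []
for (X, y), p in zip(tasks, preds):
    r2 = 1 - ((y - p) ** 2).sum() / ((y - y.mean()) ** 2).sum()
    weights.append(max(r2, 0.0))
outliers = [t for t, w in enumerate(weights) if w < 0.1]

# Block 3: task-specific fine-tuning, one boosting pass per task.
for t, (X, y) in enumerate(tasks):
    for _ in range(20):
        h = fit_stump(X, y - preds[t])
        preds[t] += lr * h(X)

print("outlier tasks:", outliers)  # the sign-flipped task is isolated
mse = [float(np.mean((y - p) ** 2)) for (_, y), p in zip(tasks, preds)]
print("per-task MSE:", [round(m, 3) for m in mse])
```

The sketch shows why the ordering matters: the shared stage helps the related tasks but hurts the sign-flipped one, which makes the outlier detectable before per-task fine-tuning recovers its fit.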