🤖 AI Summary
When multi-objective optimization (MOO) solvers hand their update directions to Adam, control over Pareto trade-offs weakens because of two mismatches: Adam's adaptive scaling dilutes the solver's preference weights (a weighting mismatch) and distorts the geometry the solver assumes (a geometric mismatch). To address this, this work proposes MAdam, a plug-and-play wrapper for Adam that preconditions the update direction with a preference-conditioned curvature and performs parameter updates in a whitened space. The approach is described as the first to resolve the weighting and geometric mismatches simultaneously, enabling metric-aware adaptive updates without modifying the base solver or the optimizer. MAdam is compatible with loss-balancing, gradient-balancing, and Pareto-based solvers, and is reported to consistently outperform standard Adam across all three solver families in multi-task learning, Pareto front recovery, physics-informed neural networks, and medical imaging.
📝 Abstract
Multi-objective optimization (MOO) underlies many machine learning problems, yet MOO solvers across the loss-balancing, gradient-balancing, and Pareto-based families almost universally hand their reconciled directions to Adam (Kingma and Ba, 2015). We show this coupling introduces two systematic gaps between the solver's intent and the optimizer's execution. The first is a weighting mismatch: Adam's second-moment denominator entangles the time-varying preference vector with gradient statistics, marginalizing the preference into a history average and collapsing distinct Pareto trade-offs toward a near-uniform mixture. The second is a geometric mismatch: Adam's adaptive metric distorts the Euclidean geometry MOO solvers assume, turning aligned objectives into apparent conflicts. To resolve both jointly, we introduce MAdam (Metric-Aware Multi-Objective Adam), a drop-in wrapper that leaves both solver and optimizer unchanged. MAdam preconditions the reconciled direction by the preference-conditioned curvature of the scalarized objective; on this whitened input, Adam's second moment collapses to identity, so the realized update is governed by the preference-conditioned metric. Across multi-task learning, Pareto-front recovery, physics-informed neural networks, and medical imaging, MAdam consistently improves over Adam for every solver family.
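The weighting mismatch can be seen in a deliberately simplified toy case. The sketch below is not the paper's analysis or the MAdam algorithm: it assumes two hypothetical objectives that act on disjoint coordinates, gradients that stay constant over time, and plain linear scalarization. The function names `adam_displacement` and `scalarize` are illustrative only. Under these assumptions, Adam's per-coordinate normalization cancels the preference weights, so two very different preference vectors produce essentially the same displacement.

```python
import math


def adam_displacement(grad, steps=10, lr=1e-2, beta1=0.9, beta2=0.999, eps=1e-8):
    """Total displacement after `steps` standard Adam updates on a fixed gradient."""
    m = [0.0] * len(grad)
    v = [0.0] * len(grad)
    total = [0.0] * len(grad)
    for t in range(1, steps + 1):
        for i, g in enumerate(grad):
            m[i] = beta1 * m[i] + (1 - beta1) * g        # first-moment estimate
            v[i] = beta2 * v[i] + (1 - beta2) * g * g    # second-moment estimate
            m_hat = m[i] / (1 - beta1 ** t)              # bias correction
            v_hat = v[i] / (1 - beta2 ** t)
            total[i] -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return total


def scalarize(weights, grads):
    """Linear scalarization: preference-weighted sum of per-objective gradients."""
    dim = len(grads[0])
    return [sum(w * g[i] for w, g in zip(weights, grads)) for i in range(dim)]


# Assumed toy setup: objective 1 only affects coordinate 0,
# objective 2 only affects coordinate 1.
grads = [[2.0, 0.0], [0.0, -0.5]]

# Two opposite preference vectors over the two objectives.
step_a = adam_displacement(scalarize([0.9, 0.1], grads))
step_b = adam_displacement(scalarize([0.1, 0.9], grads))

print("preference (0.9, 0.1):", step_a)
print("preference (0.1, 0.9):", step_b)
```

With plain gradient descent the two displacements would differ in proportion to the weights; here they agree up to the effect of `eps`, because each coordinate's step depends only on the sign of its gradient. The abstract's claim is broader (time-varying preferences and shared parameters), which this toy case does not cover.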