🤖 AI Summary
Heterogeneous feature distributions in distributed and federated learning impede effective model aggregation and lead to slow convergence.
Method: This paper introduces the energy distance—a theoretically grounded statistical divergence measure—for quantifying feature distribution heterogeneity. We propose a Taylor-series-based approximation algorithm that preserves theoretical fidelity while significantly reducing computational overhead. Building upon this metric, we design an adaptive penalty-weighting mechanism that aligns gradient or model updates via predictive alignment, integrating distributed statistical inference with weight calibration.
Contribution/Results: The method ensures rigorous theoretical guarantees without compromising practicality. Extensive experiments demonstrate that our approach substantially accelerates convergence and improves accuracy across diverse heterogeneous data settings, while exhibiting strong robustness and generalization capability.
📝 Abstract
In distributed and federated learning, heterogeneity across data sources remains a major obstacle to effective model aggregation and convergence. We focus on feature heterogeneity and introduce energy distance as a sensitive measure for quantifying distributional discrepancies. While we show that energy distance is robust for detecting data distribution shifts, its direct use in large-scale systems can be prohibitively expensive. To address this, we develop Taylor approximations that preserve key theoretical quantitative properties while reducing computational overhead. Through simulation studies, we show how accurately capturing feature discrepancies boosts convergence in distributed learning. Finally, we propose a novel application of energy distance to assign penalty weights for aligning predictions across heterogeneous nodes, ultimately enhancing coordination in federated and distributed settings.