AI Summary
To address low-quality participation, free-riding, and slow model convergence arising from agent self-interest in federated learning, this paper proposes an incentive-aware framework that explicitly models heterogeneous contribution efforts under data heterogeneity. It introduces the Wasserstein distance to quantify inter-agent heterogeneity in contribution effort, the first such application in this context. A peer-prediction-based truthful-reporting incentive mechanism is designed, and a two-stage Stackelberg game is formulated, with a rigorous proof of equilibrium existence. Theoretical analysis characterizes the generalization error gap and establishes tighter upper bounds on convergence. Extensive experiments on multiple real-world datasets demonstrate that the proposed method significantly accelerates global model convergence, improves final accuracy, effectively incentivizes high-quality data contributions, and mitigates malicious dropout and inefficient participation.
Abstract
Federated learning (FL) provides a promising paradigm for facilitating collaboration among multiple clients that jointly learn a global model without directly sharing their local data. However, existing research suffers from two caveats: 1) from the agents' perspective, voluntary and unselfish participation is often assumed, yet self-interested agents may opt out of the system or provide low-quality contributions without proper incentives; 2) from the mechanism designer's perspective, the aggregated model can be unsatisfactory because existing game-theoretical federated learning approaches for data collection ignore the heterogeneous effort reflected in the contributed data. To alleviate the above challenges, we propose an incentive-aware framework for agent participation that accounts for data heterogeneity to accelerate convergence. Specifically, we first introduce the Wasserstein distance to explicitly characterize heterogeneous effort and reformulate the existing upper bound on convergence. To induce truthful reporting from agents, we analyze and measure the generalization error gap between any two agents, leveraging the peer prediction mechanism to develop score functions. We further present a two-stage Stackelberg game that formalizes the process and prove the existence of an equilibrium. Extensive experiments on real-world datasets demonstrate the effectiveness of the proposed mechanism.
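As a minimal sketch of the heterogeneity measure the abstract describes, the snippet below computes the 1-Wasserstein distance between two agents' empirical label distributions. This is an illustrative stand-in, not the paper's implementation: the function name `label_heterogeneity` and the choice of label distributions as the compared quantity are assumptions for the example.

```python
# Illustrative sketch (assumed setup, not the paper's code): quantify
# inter-agent data heterogeneity as the 1-Wasserstein distance between
# empirical label distributions over a shared class support.
import numpy as np
from scipy.stats import wasserstein_distance

def label_heterogeneity(labels_a, labels_b, num_classes):
    """W1 distance between two agents' empirical label distributions."""
    support = np.arange(num_classes)
    p_a = np.bincount(labels_a, minlength=num_classes) / len(labels_a)
    p_b = np.bincount(labels_b, minlength=num_classes) / len(labels_b)
    # Treat each distribution as a weighted point mass on the class indices.
    return wasserstein_distance(support, support, u_weights=p_a, v_weights=p_b)

# Agent A holds mostly class 0, agent B mostly class 9: highly non-IID pair.
rng = np.random.default_rng(0)
labels_a = rng.choice(10, size=1000, p=[0.8] + [0.2 / 9] * 9)
labels_b = rng.choice(10, size=1000, p=[0.2 / 9] * 9 + [0.8])
distance = label_heterogeneity(labels_a, labels_b, 10)
```

Identical distributions give a distance of zero, and the distance grows as the agents' local data drift apart, which is what makes it usable as a proxy for heterogeneous contribution effort.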