Communication-Efficient Module-Wise Federated Learning for Grasp Pose Detection in Cluttered Environments

📅 2025-07-08
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the dual challenges of privacy preservation and communication overhead in grasp pose detection (GPD) within cluttered environments, this paper proposes a module-wise federated learning framework. Unlike conventional synchronous full-model updates, the approach analyzes learning dynamics at the module level and uses a two-stage training protocol: a standard full-model training stage, followed by a communication-efficient stage in which only the slower-converging modules are trained and their partial updates are communicated and aggregated. By combining modular model partitioning, dynamic communication scheduling, and partial model updates, the framework reduces bandwidth consumption. On the GraspNet-1B dataset, it achieves higher accuracy than baselines, including FedAvg, under identical communication budgets. Real-world robot experiments further show higher grasp success rates than baseline methods in cluttered scenes, supporting its efficiency, practicality, and generalization capability.

📝 Abstract
Grasp pose detection (GPD) is a fundamental capability for robotic autonomy, but its reliance on large, diverse datasets creates significant data privacy and centralization challenges. Federated Learning (FL) offers a privacy-preserving solution, but its application to GPD is hindered by the substantial communication overhead of large models, a key issue for resource-constrained robots. To address this, we propose a novel module-wise FL framework that begins by analyzing the learning dynamics of the GPD model's functional components. This analysis identifies slower-converging modules, to which our framework then allocates additional communication effort. This is realized through a two-phase process: a standard full-model training phase is followed by a communication-efficient phase where only the identified subset of slower-converging modules is trained and their partial updates are aggregated. Extensive experiments on the GraspNet-1B dataset demonstrate that our method outperforms standard FedAvg and other baselines, achieving higher accuracy for a given communication budget. Furthermore, real-world experiments on a physical robot validate our approach, showing a superior grasp success rate compared to baseline methods in cluttered scenes. Our work presents a communication-efficient framework for training robust, generalized GPD models in a decentralized manner, effectively improving the trade-off between communication cost and model performance.
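
The partial aggregation step in the second phase can be illustrated with a minimal sketch. It assumes a FedAvg-style weighted average restricted to parameters whose names start with a selected module prefix; the module names (`backbone`, `head`), the flat-list parameter layout, and the function names are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Optional

State = Dict[str, List[float]]  # parameter name -> flat list of weights


def in_scope(name: str, modules: Optional[List[str]]) -> bool:
    """True if the parameter belongs to a selected module (all parameters if modules is None)."""
    return modules is None or any(name.startswith(m + ".") for m in modules)


def aggregate(global_state: State,
              client_states: List[State],
              client_sizes: List[int],
              modules: Optional[List[str]] = None) -> State:
    """Sample-size-weighted average of client parameters.

    With modules=None this is a full-model average (phase one).
    With a list of module prefixes, only those modules are averaged and
    every other parameter keeps its current global value (phase two).
    """
    total = float(sum(client_sizes))
    new_state: State = {}
    for name, old in global_state.items():
        if not in_scope(name, modules):
            new_state[name] = list(old)  # not communicated: keep global value
            continue
        avg = [0.0] * len(old)
        for state, n in zip(client_states, client_sizes):
            for i, w in enumerate(state[name]):
                avg[i] += (n / total) * w
        new_state[name] = avg
    return new_state


# Phase-two example: clients upload only the hypothetical "head" module.
global_state = {"backbone.w": [0.0, 0.0], "head.w": [0.0]}
uploads = [{"head.w": [1.0]}, {"head.w": [3.0]}]
updated = aggregate(global_state, uploads, client_sizes=[1, 3], modules=["head"])
```

In this sketch the clients never send `backbone.w` during the second phase, which is where the bandwidth saving would come from; how the actual framework partitions the GPD model is not specified in the abstract.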

Problem

Research questions and friction points this paper is trying to address.

Reducing communication overhead in federated learning for grasp pose detection
Addressing slow-converging modules in GPD model training
Improving trade-off between communication cost and model performance
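
To make the communication trade-off concrete, the following back-of-the-envelope sketch counts parameters uploaded per client under a two-phase schedule. The module sizes and round counts are made-up illustrative numbers, not figures from the paper, and `total_upload_params` is a hypothetical helper.

```python
from typing import Dict, List


def total_upload_params(module_sizes: Dict[str, int],
                        full_rounds: int,
                        partial_rounds: int,
                        selected: List[str]) -> int:
    """Parameters uploaded by one client: full-model rounds plus partial rounds."""
    full_cost = sum(module_sizes.values())
    partial_cost = sum(module_sizes[m] for m in selected)
    return full_rounds * full_cost + partial_rounds * partial_cost


# Illustrative sizes only (arbitrary units).
sizes = {"backbone": 800, "graspness": 100, "head": 100}

two_phase = total_upload_params(sizes, full_rounds=10, partial_rounds=40, selected=["head"])
full_only = total_upload_params(sizes, full_rounds=50, partial_rounds=0, selected=[])
```

Under these assumed numbers the two-phase schedule uploads 14,000 units against 50,000 for 50 full-model rounds; the real saving depends on how large the selected modules are relative to the whole model.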

Innovation

Methods, ideas, or system contributions that make the work stand out.

Module-wise FL framework for GPD
Two-phase training with partial updates
Optimizes communication for slower-converging modules
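
One way the selection of slower-converging modules could be sketched is shown below. The criterion here (ratio of late to early average update norm per module) is an assumption chosen for illustration; the abstract does not state which learning-dynamics measure the paper uses, and the names `select_slow_modules`, `window`, and `top_k` are hypothetical.

```python
from typing import Dict, List


def select_slow_modules(history: Dict[str, List[float]],
                        window: int,
                        top_k: int) -> List[str]:
    """Rank modules by how little their update norm has decayed.

    history maps a module name to its per-round update norms and must hold
    at least `window` entries per module. A ratio near 1 means the module is
    still changing about as much as at the start, which this sketch treats
    as slow convergence.
    """
    scores: Dict[str, float] = {}
    for name, norms in history.items():
        early = sum(norms[:window]) / window
        late = sum(norms[-window:]) / window
        scores[name] = late / early if early > 0 else 0.0
    return sorted(scores, key=lambda m: scores[m], reverse=True)[:top_k]


# Hypothetical per-round update norms recorded during full-model training.
history = {
    "backbone": [1.0, 0.5, 0.1, 0.05],  # decays quickly
    "head": [1.0, 0.9, 0.8, 0.8],       # still changing
}
slow = select_slow_modules(history, window=2, top_k=1)
```

The modules returned by such a selection step would then be the only ones trained and communicated in the second phase.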