🤖 AI Summary
To address privacy-preservation challenges in ship detection from multi-source heterogeneous satellite imagery, this paper proposes a federated learning (FL)-based distributed collaborative modeling framework that avoids centralized sharing of raw images and sensitive annotation data. We present the first systematic evaluation of four FL algorithms (FedAvg, FedProx, FedOpt, and FedMedian) for cross-domain satellite ship detection, and introduce a strategy for jointly configuring the number of communication rounds and local training epochs. Building on YOLOv8, we construct a distributed detection architecture that improves mAP by 12.3%–18.7% on small-scale client datasets, approaching the performance of fully centralized training. Our core contributions are: (i) empirical validation of FL’s efficacy in remote sensing ship detection; (ii) identification of the critical trade-offs between detection accuracy and computational efficiency that are governed by algorithm selection and training configuration; and (iii) a reproducible technical pathway for privacy-sensitive, multi-source remote sensing intelligence analysis.
📝 Abstract
We investigate the application of Federated Learning (FL) to ship detection across diverse satellite datasets, offering a privacy-preserving solution that eliminates the need for data sharing or centralized collection. This approach is particularly advantageous when handling commercial satellite imagery or sensitive ship annotations. Four FL algorithms (FedAvg, FedProx, FedOpt, and FedMedian) are evaluated and compared with a local training baseline, in which the YOLOv8 ship detection model is trained independently on each dataset without sharing learned parameters. The results reveal that the FL models substantially improve detection accuracy over training on the smaller local datasets alone and achieve performance close to that of global training on all datasets combined. Furthermore, the study underscores the importance of selecting appropriate FL configurations, such as the number of communication rounds and local training epochs, to optimize detection precision while maintaining computational efficiency.
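To illustrate how two of the aggregation rules named above differ, the following is a minimal sketch of one server-side aggregation step, not the implementation used in the study. It assumes each client returns its model parameters as a dictionary of flattened float lists; the names `Weights`, `fedavg`, `fedmedian`, `clients`, and `sizes` are hypothetical and chosen only for this example. In a real YOLOv8 setup the parameters would be tensors, and an FL framework would typically handle the communication.

```python
from statistics import median
from typing import Dict, List

# Assumed representation: parameter name -> flattened list of values.
Weights = Dict[str, List[float]]


def fedavg(client_weights: List[Weights], client_sizes: List[int]) -> Weights:
    """FedAvg-style aggregation: mean weighted by each client's sample count."""
    total = sum(client_sizes)
    aggregated: Weights = {}
    for name in client_weights[0]:
        # Group the i-th value of this parameter across all clients.
        columns = zip(*(w[name] for w in client_weights))
        aggregated[name] = [
            sum(v * n for v, n in zip(col, client_sizes)) / total
            for col in columns
        ]
    return aggregated


def fedmedian(client_weights: List[Weights]) -> Weights:
    """FedMedian-style aggregation: coordinate-wise median across clients."""
    aggregated: Weights = {}
    for name in client_weights[0]:
        columns = zip(*(w[name] for w in client_weights))
        aggregated[name] = [median(col) for col in columns]
    return aggregated


# Three hypothetical clients; the third holds more data and has an
# outlying value in the first coordinate.
clients = [
    {"layer": [1.0, 2.0]},
    {"layer": [3.0, 4.0]},
    {"layer": [100.0, 6.0]},
]
sizes = [100, 100, 200]

avg = fedavg(clients, sizes)   # pulled toward the large outlying client
med = fedmedian(clients)       # ignores client sizes, robust to the outlier
print(avg, med)
```

In this toy case the weighted mean is dominated by the outlying client, while the median is not. This is one reason the choice of aggregation rule can matter when client datasets are heterogeneous. One such aggregation step would run per communication round, after each client has completed its local training epochs.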