🤖 AI Summary
To address the scarcity of accurately annotated data and the severe domain shift between simulation and real-world field conditions in precision agriculture, this paper proposes a self-supervised synthetic data generation method that integrates physics-based simulation with geometry-consistent image synthesis. The authors construct a high-fidelity vineyard simulation environment in Unity and synthesize multi-view, multi-illumination images with pixel-accurate annotations via geometrically constrained cut-and-paste operations and real-image registration. The approach introduces a “simulation-to-real geometric alignment” paradigm, enabling fully automatic, scalable label generation while preserving ecological realism. Evaluated on table grape detection, training YOLOv8 exclusively on the synthetic data improves mAP by 12.3% over baselines, demonstrating substantial gains in model generalization and validating effective transfer from synthetic to real-world field scenarios.
📝 Abstract
In precision agriculture, the scarcity of labeled data and significant covariate shifts pose unique challenges for training machine learning models. This scarcity is particularly problematic because the environment is dynamic and agricultural subjects, as living things, change in appearance over time. We propose a novel system for generating realistic synthetic data to address these challenges. Using a vineyard simulator based on the Unity engine, our system employs a cut-and-paste technique with geometric consistency constraints to produce photo-realistic images and accurate labels from synthetic environments for training detection algorithms. This approach generates diverse data samples across various viewpoints and lighting conditions. We demonstrate considerable performance improvements in training a state-of-the-art detector by applying our method to table grape cultivation. The combination of techniques can be easily automated, an increasingly important consideration for adoption in agricultural practice.
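As a rough illustration of what a cut-and-paste step with a geometric placement constraint might look like, the following minimal sketch projects a 3D anchor point through a pinhole camera, pastes a masked object crop at that pixel location, and derives the bounding-box label from the pasted mask. It is not the authors' implementation: the function names (`project_point`, `paste_with_label`), the pinhole camera model, and all parameter values are assumptions made for this example, and the sketch omits rescaling, blending, and occlusion handling.

```python
import numpy as np


def project_point(point_cam, focal_px, cx, cy):
    """Pinhole projection of a 3D point (camera coordinates) to pixel coordinates.

    Hypothetical helper: assumes an ideal pinhole camera with no lens distortion.
    """
    x, y, z = point_cam
    return focal_px * x / z + cx, focal_px * y / z + cy


def paste_with_label(background, crop, mask, center_uv):
    """Paste `crop` onto a copy of `background` wherever `mask` is True.

    The crop is centered at `center_uv` and clipped to the image bounds.
    Returns the composited image and a bounding box (x_min, y_min, x_max, y_max)
    with exclusive max coordinates, or None if no object pixel is visible.
    """
    out = background.copy()
    h, w = mask.shape
    img_h, img_w = background.shape[:2]
    u, v = center_uv

    # Top-left corner of the crop in image coordinates (may be out of bounds).
    x0 = int(round(u - w / 2))
    y0 = int(round(v - h / 2))

    # Clip the paste window to the image.
    xs, ys = max(x0, 0), max(y0, 0)
    xe, ye = min(x0 + w, img_w), min(y0 + h, img_h)
    if xs >= xe or ys >= ye:
        return out, None

    sub_mask = mask[ys - y0:ye - y0, xs - x0:xe - x0]
    sub_crop = crop[ys - y0:ye - y0, xs - x0:xe - x0]
    if not sub_mask.any():
        return out, None

    # Copy only the object pixels; `region` is a view into `out`.
    region = out[ys:ye, xs:xe]
    region[sub_mask] = sub_crop[sub_mask]

    # The label follows directly from the visible part of the mask.
    rows, cols = np.nonzero(sub_mask)
    bbox = (
        int(xs + cols.min()),
        int(ys + rows.min()),
        int(xs + cols.max() + 1),
        int(ys + rows.max() + 1),
    )
    return out, bbox


# Example with made-up values: a 10x10 white patch placed 2 m in front of the camera.
background = np.zeros((100, 100, 3), dtype=np.uint8)
crop = np.full((10, 10, 3), 255, dtype=np.uint8)
mask = np.ones((10, 10), dtype=bool)

center = project_point((0.0, 0.0, 2.0), focal_px=100.0, cx=50.0, cy=50.0)
image, bbox = paste_with_label(background, crop, mask, center)
```

Because the label is computed from the same mask that controls the paste, the annotation stays consistent with the composited pixels even when the object is partly clipped by the image border.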