6D Strawberry Pose Estimation: Real-time and Edge AI Solutions Using Purely Synthetic Training Data

📅 2025-11-14
📈 Citations: 0
Influential: 0
🤖 AI Summary
Problem: 6D pose estimation of strawberries for agricultural harvesting robots relies heavily on real-world annotated data, which is scarce and costly to produce, while labor shortages increase the need for automated harvesting. Method: This paper proposes a lightweight solution driven purely by synthetic data: (i) a procedural synthetic data generation pipeline built in Blender to enhance image photorealism and diversity; and (ii) YOLOX-6D-Pose, a single-stage framework trained exclusively on synthetic data, enabling accurate 6D pose estimation without any real-image supervision. Contribution/Results: To our knowledge, this is the first work to empirically validate purely synthetic training data for strawberry 6D pose estimation on a resource-constrained edge device (Jetson Orin Nano). The model achieves comparable ADD-S scores on both the RTX 3090 and the Orin Nano, sustains real-time inference (>15 FPS), and accurately estimates poses of ripe and partially ripe strawberries, supporting its feasibility for field deployment.

📝 Abstract
Automated and selective harvesting of fruits has become an important area of research, particularly due to challenges such as high costs and a shortage of seasonal labor in advanced economies. This paper focuses on 6D pose estimation of strawberries using purely synthetic data generated through a procedural pipeline for photorealistic rendering. We employ the YOLOX-6D-Pose algorithm, a single-shot approach that leverages the YOLOX backbone, known for its balance between speed and accuracy and for its support for edge inference. To address the limited availability of training data, we introduce a robust and flexible procedural Blender pipeline for generating synthetic strawberry data from various 3D models, with a focus on making the synthesized data more realistic than in previous work so that it is a valuable resource for training pose estimation algorithms. Quantitative evaluations indicate that our models achieve comparable accuracy on both the NVIDIA RTX 3090 and Jetson Orin Nano across several ADD-S metrics, with the RTX 3090 demonstrating superior processing speed. However, the Jetson Orin Nano is particularly suited for resource-constrained environments, making it an excellent choice for deployment in agricultural robotics. Qualitative assessments further confirm the model's performance, demonstrating its capability to accurately infer the poses of ripe and partially ripe strawberries, while it struggles to detect unripe specimens. This suggests opportunities for future improvements, especially in enhancing detection of unripe strawberries (if desired) by exploring variations in color. Furthermore, the methodology presented could easily be adapted for other fruits such as apples, peaches, and plums, thereby expanding its applicability and impact in the field of agricultural automation.
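
The abstract reports accuracy in terms of ADD-S, the standard pose error for symmetric or near-symmetric objects: the mean distance from each model point under the ground-truth pose to its nearest model point under the predicted pose. The following is a minimal pure-Python sketch of that general definition, not the paper's evaluation code. The function names, the brute-force nearest-neighbor search, and the 10%-of-diameter threshold are assumptions chosen for illustration; the source does not state which thresholds its "several ADD-S metrics" use.

```python
import math


def transform(points, rotation, translation):
    """Apply a rigid pose (3x3 rotation matrix, 3-vector translation) to 3D points."""
    return [
        tuple(
            sum(rotation[i][j] * p[j] for j in range(3)) + translation[i]
            for i in range(3)
        )
        for p in points
    ]


def add_s(model_points, gt_pose, pred_pose):
    """ADD-S: mean distance from each ground-truth-posed model point
    to the closest predicted-posed model point (brute force, O(n^2))."""
    gt = transform(model_points, *gt_pose)
    pred = transform(model_points, *pred_pose)
    return sum(min(math.dist(g, p) for p in pred) for g in gt) / len(gt)


def add_s_accuracy(errors, diameter, fraction=0.1):
    """Share of samples whose ADD-S error is below fraction * object diameter.
    The default fraction of 0.1 is an assumed, commonly used threshold."""
    threshold = fraction * diameter
    return sum(e < threshold for e in errors) / len(errors)


# Hypothetical toy model: four points that are symmetric about the z-axis.
model = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (-1.0, 0.0, 0.0), (0.0, -1.0, 0.0)]
identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
rot_z_90 = [[0, -1, 0], [1, 0, 0], [0, 0, 1]]

# A 90-degree rotation about the symmetry axis maps the point set onto itself,
# so ADD-S does not penalize it.
symmetric_error = add_s(model, (identity, (0, 0, 0)), (rot_z_90, (0, 0, 0)))

# Adding a small offset along z yields an error equal to that offset.
offset_error = add_s(model, (identity, (0, 0, 0)), (rot_z_90, (0, 0, 0.01)))
```

Because a strawberry is roughly rotationally symmetric about its stem axis, a nearest-point metric of this kind avoids penalizing rotations that leave the fruit's shape essentially unchanged. For dense meshes, a k-d tree would normally replace the brute-force search.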
Problem

Research questions and friction points this paper is trying to address.

Estimating 6D strawberry pose using purely synthetic training data
Addressing labor shortages through automated fruit harvesting solutions
Developing edge-AI-compatible models for deployment in agricultural robotics
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses purely synthetic training data from Blender
Employs the YOLOX-6D-Pose algorithm for real-time inference
Optimized for edge deployment on Jetson Orin Nano
Saptarshi Neil Sinha
Fraunhofer IGD
Julius Kuhn
Fraunhofer IGD
Mika Silvan Goschke
Fraunhofer IGD
Michael Weinmann
Delft University of Technology
Computer Vision, Computer Graphics, 3D Reconstruction, Virtual Reality, Machine Learning