🤖 AI Summary
This paper addresses the base-pose selection problem for mobile-manipulator grasping from a single RGB-D frame. The authors propose an efficient, geometry-aware two-stage curriculum learning method. In Stage I, large-scale weakly supervised annotations are generated automatically using heuristic rules, such as distance and visibility, that encode geometric priors. In Stage II, the model is fine-tuned in high-fidelity simulation to bridge the gap between geometric heuristics and real-world grasp success. The architecture pairs a PointNet++-style point cloud encoder with an MLP, trained jointly to regress dense scores over candidate base poses directly, bypassing explicit motion planning and full task execution. Experiments show that the approach significantly outperforms baselines that rely solely on proximity or geometric features, both in simulation and on physical hardware, achieving better safety, reachability, and robustness, and degrading gracefully under prediction errors.
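The paper does not spell out the Stage I labeling rule beyond "distance and visibility," but the idea of cheap heuristic auto-labels can be sketched as follows. This is an illustrative stand-in, not the authors' rule: the distance band, clearance threshold, and scoring shape are all assumptions.

```python
import numpy as np

def point_segment_dist(p, a, b):
    """Distance from point p to the segment from a to b."""
    p, a, b = np.asarray(p, float), np.asarray(a, float), np.asarray(b, float)
    ab = b - a
    t = np.clip(np.dot(p - a, ab) / np.dot(ab, ab), 0.0, 1.0)
    return np.linalg.norm(p - (a + t * ab))

def heuristic_label(base_poses, target_xy, obstacles_xy,
                    d_min=0.4, d_max=0.9, clearance=0.25):
    """Stage I-style weak labels for candidate base poses (x, y):
    require the pose to sit in a distance band around the target and
    to have a clear line of sight to it; score peaks mid-band.
    All thresholds are illustrative placeholders."""
    labels = np.zeros(len(base_poses))
    for i, (x, y) in enumerate(base_poses):
        d = np.hypot(target_xy[0] - x, target_xy[1] - y)
        if not (d_min <= d <= d_max):
            continue  # outside the reachable distance band
        # visibility: no obstacle within `clearance` of the base-target segment
        if any(point_segment_dist(o, (x, y), target_xy) < clearance
               for o in obstacles_xy):
            continue
        mid = 0.5 * (d_min + d_max)
        labels[i] = 1.0 - abs(d - mid) / (0.5 * (d_max - d_min))
    return labels
```

A labeler like this costs microseconds per candidate, which is what makes large-scale weak supervision in Stage I cheap compared with running grasp trials.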
📝 Abstract
GBPP is a fast, learning-based scorer that selects a robot base pose for grasping from a single RGB-D snapshot. The method uses a two-stage curriculum: (1) a simple distance-visibility rule auto-labels a large dataset at low cost; and (2) a smaller set of high-fidelity simulation trials refines the model to match true grasp outcomes. A PointNet++-style point cloud encoder with an MLP scores dense grids of candidate poses, enabling rapid online selection without full task-and-motion optimization. In simulation and on a real mobile manipulator, GBPP outperforms proximity-only and geometry-only baselines, choosing safer, more reachable stances and degrading gracefully when its predictions are wrong. The results offer a practical recipe for data-efficient, geometry-aware base placement: use inexpensive heuristics for coverage, then calibrate with targeted simulation.
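The scoring pipeline described above (point cloud encoder plus MLP over a dense candidate grid) can be sketched with a minimal NumPy model. This is a shape-level illustration only: a single PointNet-style per-point layer with max pooling stands in for the paper's PointNet++-style encoder, weights are random rather than trained, and all layer sizes are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

class BasePoseScorer:
    """Sketch of GBPP-style scoring: encode the scene point cloud to a
    global feature, then an MLP head scores each candidate base pose.
    Random weights; demonstrates data flow, not a trained model."""

    def __init__(self, feat_dim=32):
        self.W1 = rng.standard_normal((3, feat_dim)) * 0.1      # per-point lift
        self.W2 = rng.standard_normal((feat_dim + 3, 16)) * 0.1  # head layer 1
        self.w3 = rng.standard_normal(16) * 0.1                  # head layer 2

    def encode(self, points):
        # points: (N, 3) scene cloud -> global feature (feat_dim,)
        h = np.maximum(points @ self.W1, 0.0)  # shared per-point MLP + ReLU
        return h.max(axis=0)                   # symmetric max pooling

    def score(self, points, candidates):
        # candidates: (M, 3) grid of (x, y, yaw) base poses -> (M,) scores
        g = self.encode(points)
        feats = np.hstack([np.tile(g, (len(candidates), 1)), candidates])
        h = np.maximum(feats @ self.W2, 0.0)
        return h @ self.w3

# usage: score a dense 2D grid of candidate poses and take the argmax
cloud = rng.standard_normal((256, 3))
xs, ys = np.meshgrid(np.linspace(-1, 1, 10), np.linspace(-1, 1, 10))
grid = np.stack([xs.ravel(), ys.ravel(), np.zeros(100)], axis=1)
scorer = BasePoseScorer()
scores = scorer.score(cloud, grid)
best_pose = grid[np.argmax(scores)]
```

Because every candidate reuses the same global scene feature, the cost of scoring the whole grid is one encoder pass plus one batched MLP evaluation, which is what makes dense online selection fast relative to planning-based alternatives.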