🤖 AI Summary
This paper addresses inaccurate online estimation of object kinematic models under visual uncertainties such as occlusion, low-texture regions, and image noise. It proposes a probabilistic real-time estimation framework that incorporates human hand motion as a structural prior for online kinematic learning, tightly coupling observation uncertainty modeling with rigid-body constraints. The method is built on a probabilistic graphical model that unifies real-time hand–object co-tracking, Bayesian state estimation, and uncertainty-aware optimization, keeping it robust under severe visual disturbances. Evaluated on a newly constructed, challenging benchmark dataset, the approach improves kinematic structure estimation accuracy by 195% and 140% over two state-of-the-art baselines, respectively. The resulting precise and reliable kinematic estimates enable safe, fine-grained robotic manipulation of centimeter-scale objects.
📝 Abstract
Visual uncertainties such as occlusions, lack of texture, and noise present significant challenges in obtaining accurate kinematic models for safe robotic manipulation. We introduce a probabilistic real-time approach that leverages the human hand as a prior to mitigate these uncertainties. By tracking the constrained motion of the human hand during manipulation and explicitly modeling uncertainties in visual observations, our method reliably estimates an object's kinematic model online. We validate our approach on a novel dataset featuring challenging objects that are occluded during manipulation and offer limited articulations for perception. The results demonstrate that by incorporating an appropriate prior and explicitly accounting for uncertainties, our method produces accurate estimates, outperforming two recent baselines by 195% and 140%, respectively. Furthermore, we demonstrate that our approach's estimates are precise enough to allow a robot to manipulate even small objects safely.
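The abstract describes the approach only at a high level, so here is a minimal, self-contained sketch of the underlying idea: fitting a revolute joint to a tracked hand trajectory while weighting each observation by its visual confidence. This is an illustration under assumed details, not the paper's implementation; the function `fit_revolute_joint`, the Kasa-style circle fit, and the heteroscedastic noise model are hypothetical choices made for the example.

```python
# Hedged sketch: uncertainty-weighted revolute-joint estimation from a noisy
# hand trajectory. Names and the noise model are illustrative, not the paper's.
import numpy as np

def fit_revolute_joint(points, sigmas):
    """Estimate a revolute joint from noisy 3D hand positions.

    points : (N, 3) observed hand positions along the articulation.
    sigmas : (N,) per-observation noise std. devs. (observation uncertainty).
    Returns (center, axis, radius) of the best-fit circular trajectory.
    """
    w = 1.0 / np.square(sigmas)          # inverse-variance weights
    w = w / w.sum()

    # 1) Weighted plane fit: the joint axis is the smallest-variance direction
    #    of the weighted point cloud; the other two directions span the plane.
    mean = (w[:, None] * points).sum(axis=0)
    centered = points - mean
    cov = (w[:, None] * centered).T @ centered
    _, eigvec = np.linalg.eigh(cov)       # eigenvalues in ascending order
    axis = eigvec[:, 0]                   # joint (rotation) axis direction
    basis = eigvec[:, 1:]                 # in-plane basis, shape (3, 2)

    # 2) Weighted algebraic circle fit (Kasa) in the rotation plane:
    #    u^2 + v^2 = 2*a*u + 2*b*v + c, center (a, b), radius^2 = c + a^2 + b^2.
    uv = centered @ basis                 # (N, 2) in-plane coordinates
    A = np.hstack([2 * uv, np.ones((len(uv), 1))])
    rhs = (uv ** 2).sum(axis=1)
    sw = np.sqrt(w)
    sol, *_ = np.linalg.lstsq(A * sw[:, None], rhs * sw, rcond=None)
    center_2d, c = sol[:2], sol[2]
    radius = np.sqrt(c + center_2d @ center_2d)
    center = mean + basis @ center_2d
    return center, axis, radius

# Usage example: a 30-degree door-opening arc observed with heteroscedastic
# noise (larger noise where the hand would be occluded).
rng = np.random.default_rng(0)
angles = np.linspace(0.0, np.deg2rad(30), 50)
true_center, true_radius = np.array([0.4, 0.0, 0.8]), 0.35
clean = true_center + true_radius * np.stack(
    [np.cos(angles), np.sin(angles), np.zeros_like(angles)], axis=1)
sigmas = np.where(angles > np.deg2rad(20), 0.01, 0.002)  # occluded part is noisier
noisy = clean + rng.normal(scale=sigmas[:, None], size=clean.shape)

center, axis, radius = fit_revolute_joint(noisy, sigmas)
print("axis :", np.round(axis, 3))                 # expected close to [0, 0, +/-1]
print("radius error [mm]:", abs(radius - true_radius) * 1000)
```

Inverse-variance weighting is the uncertainty-aware step in this sketch: frames where the hand is occluded or poorly textured get large `sigmas` and therefore contribute little to the plane and circle fits, which mirrors the abstract's emphasis on explicitly modeling uncertainties in visual observations.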