🤖 AI Summary
Generating realistic and natural robotic dexterous hand manipulation data remains challenging due to interpenetration artifacts, interaction distortion, and insufficient motion diversity. To address these issues, we propose the first unified framework jointly modeling hand pose retargeting and object interaction. Our approach integrates multi-source human hand–object motion capture data, introduces a learnable contact graph to explicitly encode hand–object contact relationships, and employs a differentiable temporal loss to enforce motion continuity. Furthermore, physics-based constraints are incorporated for post-hoc pose refinement. The method achieves high pose accuracy while significantly improving naturalness and diversity of generated motions. On standard benchmarks, it reduces interpenetration error by 32.7% compared to prior work. This advancement provides a robust, high-fidelity data foundation for simulation-based training and embodied intelligent manipulation.
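The summary mentions explicitly encoding hand–object contact and reducing interpenetration. As a toy illustration of those two ideas (not the paper's actual method — the function, the spherical object, and the threshold are all illustrative assumptions), contact can be read off from near-zero signed distances between hand points and the object surface, while interpenetration is the accumulated depth of points inside the object:

```python
import numpy as np

def contact_and_penetration(hand_pts, center, radius, contact_thresh=0.005):
    """Toy contact encoding against a spherical object (illustrative only).

    hand_pts: (N, 3) sampled hand surface points.
    center, radius: sphere standing in for the manipulated object.
    Returns a binary per-point contact map and the total penetration depth.
    """
    # Signed distance to the sphere: negative means inside the object.
    sdf = np.linalg.norm(hand_pts - center, axis=1) - radius
    # Points lying (almost) on the surface are marked as "in contact".
    contact = (np.abs(sdf) < contact_thresh).astype(np.float32)
    # Interpenetration: sum of how deep inside the object points sit.
    penetration = float(np.sum(np.clip(-sdf, 0.0, None)))
    return contact, penetration

pts = np.array([[0.05, 0.0, 0.0],   # exactly on the surface -> contact
                [0.02, 0.0, 0.0],   # inside the object -> penetration
                [0.20, 0.0, 0.0]])  # far away -> neither
contact, pen = contact_and_penetration(pts, np.zeros(3), 0.05)
```

A differentiable variant of this penetration term (e.g. replacing the clip with a smooth penalty) is the kind of quantity a physics-based refinement stage can minimize to push the hand out of the object.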
📝 Abstract
Despite advances in hand-object interaction modeling, generating realistic dexterous manipulation data for robotic hands remains a challenge. Retargeting methods often suffer from low accuracy and fail to account for hand-object interactions, leading to artifacts such as interpenetration. Generative methods, lacking human hand priors, produce limited and unnatural poses. We propose a data transformation pipeline that combines human hand and object data from multiple sources for high-precision retargeting. Our approach uses a differentiable temporal loss to ensure temporal consistency and generates contact maps to refine hand-object interactions. Experiments show our method significantly improves pose accuracy, naturalness, and diversity, providing a robust solution for hand-object interaction modeling.
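The temporal loss described above penalizes abrupt frame-to-frame changes in the retargeted motion. A minimal numpy sketch of such a first-difference smoothness term (the function name and array layout are assumptions, not the paper's implementation) might look like:

```python
import numpy as np

def temporal_consistency_loss(joint_angles):
    """Penalize frame-to-frame jitter in a retargeted joint-angle trajectory.

    joint_angles: (T, J) array — T frames of J hand joint angles (radians).
    Returns the mean squared first difference between consecutive frames;
    smoother trajectories yield lower values.
    """
    diffs = np.diff(joint_angles, axis=0)   # (T-1, J) per-frame deltas
    return float(np.mean(diffs ** 2))

# A smooth trajectory scores lower than the same trajectory with added noise.
t = np.linspace(0.0, 1.0, 50)[:, None]
smooth = np.repeat(np.sin(t), 5, axis=1)
jittery = smooth + 0.1 * np.random.default_rng(0).standard_normal(smooth.shape)
assert temporal_consistency_loss(smooth) < temporal_consistency_loss(jittery)
```

Because the term is a simple differentiable function of the pose sequence, it can be minimized jointly with pose-accuracy and contact objectives during retargeting.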