🤖 AI Summary
Learning dexterous manipulation is hindered by the high cost of acquiring high-quality demonstration data and the low sample efficiency of reinforcement learning exploration. This work proposes the EaDex framework, which uses a single RGB-D camera to capture human hand motions and combines MANO-based hand modeling, data normalization, and motion retargeting to construct structured, low-cost demonstrations. To bridge the gap between imitation and autonomous policy improvement, the method introduces a contact-reward-based dynamic demonstration annealing mechanism that shifts training from demonstration-guided exploration to self-optimization as contact rewards accumulate. Evaluated on three dexterous hands and three articulated object-opening tasks (nine cross-embodiment settings in total), the approach achieves a 55.3% relative improvement over a baseline without demonstration annealing, indicating that it remains effective across different robotic hands.
📝 Abstract
Dexterous manipulation learning has long been hindered by the high costs of data and training: pure reinforcement learning typically requires large-scale interactive exploration, while imitation learning depends on high-quality demonstrations that are expensive to collect. To address this problem, we propose EaDex, a multi-embodiment dexterous manipulation learning framework for low-cost demonstration conditions. EaDex generates demonstration data rapidly and thereby reduces the training time needed to learn dexterous manipulation. At the data level, EaDex captures human hand motions using only a single RGB-D camera and constructs structured demonstration data through MANO-based hand modeling, data normalization, and motion retargeting. At the learning level, we introduce a contact-reward-based dynamic demonstration annealing mechanism, which guides early-stage exploration with demonstrations and gradually transitions to autonomous optimization as contact rewards accumulate. Using our custom dataset, we evaluate EaDex on three dexterous hands and three articulated object-opening tasks, covering nine cross-embodiment manipulation settings, and achieve a 55.3% relative improvement over the baseline without demonstration annealing. These results validate the effectiveness of the proposed low-cost demonstration pipeline and the dynamic demonstration annealing strategy for dexterous manipulation learning.
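The abstract does not state the annealing rule itself, so the following is only a minimal sketch of what a contact-reward-driven demonstration schedule could look like. It assumes that the demonstration weight decays linearly as positive contact reward accumulates and that the weight scales an imitation term added to the task reward. The names `DemoAnnealer`, `contact_budget`, and `blended_reward` are hypothetical and do not come from EaDex.

```python
from dataclasses import dataclass


@dataclass
class DemoAnnealer:
    """Hypothetical schedule: demonstration weight shrinks as contact reward accumulates."""

    contact_budget: float = 100.0   # assumed: cumulative contact reward at which guidance ends
    min_weight: float = 0.0         # assumed: floor for the demonstration weight
    cumulative_contact: float = 0.0

    @property
    def weight(self) -> float:
        # Linear decay from 1.0 (full guidance) down to min_weight (autonomous learning).
        progress = min(self.cumulative_contact / self.contact_budget, 1.0)
        return max(1.0 - progress, self.min_weight)

    def update(self, contact_reward: float) -> float:
        # Only positive contact reward counts as progress toward autonomy.
        self.cumulative_contact += max(contact_reward, 0.0)
        return self.weight


def blended_reward(task_reward: float, imitation_reward: float, demo_weight: float) -> float:
    # Assumed form: the imitation term is scaled by the current demonstration weight.
    return task_reward + demo_weight * imitation_reward


if __name__ == "__main__":
    annealer = DemoAnnealer(contact_budget=10.0)
    for step, contact in enumerate([0.0, 2.0, 3.0, 5.0]):
        w = annealer.update(contact)
        print(step, w, blended_reward(task_reward=1.0, imitation_reward=2.0, demo_weight=w))
```

Tying the schedule to accumulated contact reward rather than to a fixed step count means guidance is withdrawn only once the policy is actually making contact with the object. The linear form and the budget value here are placeholders, not the paper's settings.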