MuST: Multi-Head Skill Transformer for Long-Horizon Dexterous Manipulation with Skill Progress

📅 2025-02-04
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address multi-stage skill coordination and poor generalization of action sequences in long-horizon dexterous robotic manipulation (e.g., pick-and-place and tight packing), this paper proposes the Multi-Head Skill Transformer (MuST) architecture. It introduces three key innovations: (1) a skill-level “progress value” mechanism enabling interpretable skill selection and smooth transitions; (2) dynamic skill expansion and adaptive subtask sequencing; and (3) integration of motion primitive learning with progress-guided skill execution, coupled with a simulation-to-reality transfer training strategy. Evaluated on both simulated and real robotic platforms, the approach achieves significant improvements in task success rate, supports longer skill chains, and demonstrates superior cross-task generalization, outperforming state-of-the-art methods across all metrics.

📝 Abstract
Robot picking and packing tasks require dexterous manipulation skills, such as rearranging objects to establish a good grasping pose, or placing and pushing items to achieve tight packing. These tasks are challenging for robots due to the complexity and variability of the required actions. To tackle the difficulty of learning and executing long-horizon tasks, we propose a novel framework called the Multi-Head Skill Transformer (MuST). This model is designed to learn and sequentially chain together multiple motion primitives (skills), enabling robots to perform complex sequences of actions effectively. MuST introduces a "progress value" for each skill, guiding the robot on which skill to execute next and ensuring smooth transitions between skills. Additionally, our model is capable of expanding its skill set and managing various sequences of sub-tasks efficiently. Extensive experiments in both simulated and real-world environments demonstrate that MuST significantly enhances the robot's ability to perform long-horizon dexterous manipulation tasks.
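The idea of using a per-skill "progress value" to decide which skill runs next can be illustrated with a minimal Python sketch. This is not the paper's implementation: the function name `select_skill`, the `done_threshold` parameter, the skill names, and the selection rule itself (run the unfinished skill with the highest progress) are all assumptions made for illustration, and the progress values are taken as given rather than predicted by a model.

```python
from typing import Dict, Optional


def select_skill(progress: Dict[str, float],
                 done_threshold: float = 0.95) -> Optional[str]:
    """Pick the next skill to execute from per-skill progress values.

    Hypothetical rule (an assumption, not taken from the paper):
    a skill counts as finished once its progress reaches
    `done_threshold`; among unfinished skills, the one with the
    highest progress is selected. Ties go to the skill listed first.
    Returns None when every skill is finished.
    """
    unfinished = {name: p for name, p in progress.items()
                  if p < done_threshold}
    if not unfinished:
        return None
    return max(unfinished, key=unfinished.get)


# Illustrative progress values in [0, 1] for three hypothetical skills.
current = {"rearrange": 1.0, "place": 0.4, "push": 0.0}
next_skill = select_skill(current)
```

Under these assumptions, a finished skill drops out of consideration as soon as its progress crosses the threshold, so the selection moves to the next skill without an explicit hand-coded task order.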
Problem

Research questions and friction points this paper is trying to address.

Long-horizon dexterous manipulation tasks
Sequential chaining of motion primitives
Skill progress-guided task execution
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-Head Skill Transformer
Progress value guidance
Expanding skill set
Kai Gao
Amazon Robotics, MA, USA; Department of Computer Science, Rutgers University, NJ, USA
Fan Wang
Amazon Robotics, MA, USA
Erica Aduh
Amazon Robotics, MA, USA
Dylan Randle
Amazon
Artificial Intelligence, Machine Learning, Robotics
Jane Shi
Amazon Robotics, MA, USA