🤖 AI Summary
To address the challenge of achieving precise six-degree-of-freedom (6-DOF) end-effector (EE) pose tracking in task space for wheeled-quadrupedal manipulator robots, this paper proposes a deep reinforcement learning (DRL)-based whole-body coordinated control framework. The method implements closed-loop 6D pose control directly in task space, bypassing hierarchical planning or inverse kinematics decomposition. Its key contributions are: (1) a nonlinear Reward Fusion Module (RFM) that explicitly models the multi-stage coupling among base motion, manipulator operation, and balance maintenance; and (2) a teacher-student RL training paradigm to mitigate motion-balance coupling under high kinematic redundancy. Extensive simulation and real-robot experiments demonstrate smooth, robust tracking, with mean position error < 5 cm and orientation error < 0.1 rad, which the authors report as state of the art.
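To illustrate what "nonlinear" fusion of reward terms can mean in general, the sketch below contrasts a conventional weighted sum with a multiplicative combination. This is not the paper's RFM, whose exact form is not given here; the function names (`exp_kernel`, `fuse_weighted_sum`, `fuse_product`), the exponential kernel, and the example values are all assumptions chosen for illustration.

```python
import math

def exp_kernel(error, scale):
    """Map a non-negative error to a reward in (0, 1]; zero error gives 1.0.
    (Hypothetical shaping function, not taken from the paper.)"""
    return math.exp(-error / scale)

def fuse_weighted_sum(terms, weights):
    """Linear fusion: each term contributes independently of the others."""
    return sum(w * r for w, r in zip(weights, terms))

def fuse_product(terms):
    """One simple nonlinear fusion: the product of the terms.
    If any single term collapses to zero, the fused reward does too."""
    return math.prod(terms)

# Hypothetical per-task rewards: EE position, EE orientation, balance.
terms = [exp_kernel(0.0, 0.1), exp_kernel(0.0, 0.2), 0.0]  # balance has failed
linear = fuse_weighted_sum(terms, [1 / 3, 1 / 3, 1 / 3])
nonlinear = fuse_product(terms)
```

With the weighted sum, a policy that tracks the target perfectly while losing balance still collects two thirds of the maximum reward; with the product it collects none. That coupling between tasks is the kind of behaviour a linear sum cannot express.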
📝 Abstract
In this paper, we study the whole-body loco-manipulation problem using reinforcement learning (RL). Specifically, we focus on how to coordinate the floating base and the robotic arm of a wheeled-quadrupedal manipulator robot to achieve direct six-dimensional (6D) end-effector (EE) pose tracking in task space. Unlike conventional whole-body loco-manipulation problems that track both floating-base and end-effector commands, the direct EE pose tracking problem requires inherent balance among redundant degrees of freedom in the whole-body motion. We leverage RL to solve this challenging problem. To address the associated difficulties, we develop a novel reward fusion module (RFM) that systematically integrates reward terms corresponding to different tasks in a nonlinear manner. In this way, the inherent multi-stage and hierarchical nature of the loco-manipulation problem can be carefully accommodated. By combining the proposed RFM with a teacher-student RL training paradigm, we present a complete RL scheme to achieve 6D EE pose tracking for the wheeled-quadrupedal manipulator robot. Extensive simulation and hardware experiments demonstrate the significance of the RFM. In particular, we enable smooth and precise tracking performance, achieving state-of-the-art tracking position error of less than 5 cm and rotation error of less than 0.1 rad.
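The reported tolerances (5 cm in position, 0.1 rad in rotation) can be made concrete with a minimal sketch of one common way to measure 6D pose error. The abstract does not say how the errors are computed, so the choices here are assumptions: positions in metres, orientations as unit quaternions in (w, x, y, z) order, and rotation error taken as the geodesic angle between the two orientations. The function names are hypothetical.

```python
import math

def position_error(p_target, p_actual):
    """Euclidean distance between two 3D points (metres in, metres out)."""
    return math.dist(p_target, p_actual)

def orientation_error(q_target, q_actual):
    """Geodesic angle in radians between two unit quaternions (w, x, y, z).
    The absolute value treats q and -q as the same rotation."""
    dot = abs(sum(a * b for a, b in zip(q_target, q_actual)))
    dot = min(1.0, dot)  # guard acos against rounding slightly above 1
    return 2.0 * math.acos(dot)

def within_tolerance(p_target, q_target, p_actual, q_actual,
                     pos_tol=0.05, rot_tol=0.1):
    """True if both errors are below the thresholds quoted in the abstract."""
    return (position_error(p_target, p_actual) < pos_tol
            and orientation_error(q_target, q_actual) < rot_tol)

# Example: target at the identity pose, EE off by 3 cm and 0.08 rad about z.
identity = (1.0, 0.0, 0.0, 0.0)
q_actual = (math.cos(0.04), 0.0, 0.0, math.sin(0.04))
ok = within_tolerance((0.0, 0.0, 0.0), identity, (0.03, 0.0, 0.0), q_actual)
```

In this example both errors fall inside the quoted bounds, so `ok` is true; increasing the rotation about z beyond 0.1 rad would make the check fail.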