Whole-Body Coordination for Dynamic Object Grasping with Legged Manipulators

📅 2025-08-10
📈 Citations: 0
Influential: 0
🤖 AI Summary
Addressing the challenge of whole-body coordinated dynamic grasping for quadrupedal robots in unstructured environments, this paper introduces DQ-Bench—the first systematic benchmark for dynamic grasping evaluation—and proposes DQ-Net, a teacher-student learning framework. DQ-Net jointly models static geometric and dynamic temporal features via dual-view representation, a grasp fusion module, and privileged information distillation, enabling closed-loop control from only lightweight observations: target masks, depth maps, and proprioceptive states. Experiments demonstrate that DQ-Net achieves significantly higher grasping success rates across diverse dynamic tasks than prior methods, while exhibiting superior responsiveness and cross-terrain robustness. This work establishes a reproducible evaluation standard and an efficient learning paradigm for embodied intelligent manipulation in dynamic scenarios.

📝 Abstract
Quadrupedal robots with manipulators offer strong mobility and adaptability for grasping in unstructured, dynamic environments through coordinated whole-body control. However, existing research has predominantly focused on static-object grasping, neglecting the challenges posed by dynamic targets and thus limiting applicability in dynamic scenarios such as logistics sorting and human-robot collaboration. To address this, we introduce DQ-Bench, a new benchmark that systematically evaluates dynamic grasping across varying object motions, velocities, heights, object types, and terrain complexities, along with comprehensive evaluation metrics. Building upon this benchmark, we propose DQ-Net, a compact teacher-student framework designed to infer grasp configurations from limited perceptual cues. During training, the teacher network leverages privileged information to holistically model both the static geometric properties and dynamic motion characteristics of the target, and integrates a grasp fusion module to deliver robust guidance for motion planning. Concurrently, we design a lightweight student network that performs dual-viewpoint temporal modeling using only the target mask, depth map, and proprioceptive state, enabling closed-loop action outputs without reliance on privileged data. Extensive experiments on DQ-Bench demonstrate that DQ-Net achieves robust dynamic-object grasping across multiple task settings, substantially outperforming baseline methods in both success rate and responsiveness.
Problem

Research questions and friction points this paper is trying to address.

Dynamic object grasping with legged manipulators in unstructured environments
Lack of benchmarks for dynamic grasping across varied conditions
Need for robust perception and control in dynamic scenarios
Innovation

Methods, ideas, or system contributions that make the work stand out.

DQ-Bench benchmark evaluates dynamic grasping comprehensively
DQ-Net uses teacher-student framework for grasp inference
Lightweight student network enables closed-loop action outputs
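The core of the teacher-student idea is privileged-information distillation: a teacher trained with full state supervises a student that sees only lightweight observations. A minimal sketch of action-space distillation follows; the MLP sizes, the `distill_step` helper, and the MSE regression loss are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: the teacher sees privileged state (e.g. exact
# object pose and velocity); the student sees only lightweight observations
# (target mask + depth features + proprioception).
D_PRIV, D_OBS, D_HID, D_ACT = 32, 16, 64, 8

# Frozen teacher weights (stand-in for a policy already trained with
# privileged information).
t_w1 = rng.normal(size=(D_PRIV, D_HID)) * 0.1
t_w2 = rng.normal(size=(D_HID, D_ACT)) * 0.1

# Trainable student weights.
s_w1 = rng.normal(size=(D_OBS, D_HID)) * 0.1
s_w2 = rng.normal(size=(D_HID, D_ACT)) * 0.1

def teacher_action(priv):
    # Two-layer MLP: ReLU hidden layer, linear action head.
    return np.maximum(priv @ t_w1, 0.0) @ t_w2

def distill_step(priv, obs, lr=1e-3):
    """One action-space distillation step: the student regresses the
    teacher's action from non-privileged observations (MSE loss)."""
    global s_w1, s_w2
    target = teacher_action(priv)        # fixed supervision signal
    h = np.maximum(obs @ s_w1, 0.0)      # student hidden layer (ReLU)
    err = h @ s_w2 - target              # dL/dpred for 0.5 * MSE
    dh = (err @ s_w2.T) * (h > 0)        # backprop before updating weights
    s_w2 -= lr * h.T @ err
    s_w1 -= lr * obs.T @ dh
    return float(np.mean(err ** 2))

# Paired views of the same scenes: privileged state for the teacher,
# lightweight observations for the student.
priv = rng.normal(size=(64, D_PRIV))
obs = rng.normal(size=(64, D_OBS))
losses = [distill_step(priv, obs) for _ in range(300)]
print(losses[-1] < losses[0])            # distillation loss decreases
```

At deployment only the student's forward pass is needed, which is why the non-privileged branch can stay lightweight enough for closed-loop control.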
👥 Authors
Qiwei Liang — Hong Kong University of Science and Technology (Guangzhou)
Boyang Cai — Shenzhen University
Rongyi He — Shenzhen University
Hui Li — Shenzhen University
Tao Teng — Istituto Italiano di Tecnologia (IIT); Robotics
Haihan Duan — Associate Professor, Shenzhen MSU-BIT University; Multimedia, Blockchain, Human-Centered Computing, Decentralized AI, Metaverse
Changxin Huang — Assistant Professor, Shenzhen University; Robotics, Reinforcement Learning
Runhao Zeng — Shenzhen MSU-BIT University