Masked Sensory-Temporal Attention for Sensor Generalization in Quadruped Locomotion

📅 2024-09-05
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
Problem: Quadrupedal robots face significant challenges in generalizing gait control across heterogeneous, missing, or structurally varying sensor configurations and body morphologies. Method: We propose a masked sensory-temporal attention mechanism built upon a lightweight Transformer architecture. This is the first approach to enable sensor-level fine-grained attention modeling, integrating dynamic sensor masking with cross-modal temporal attention to achieve robust state representation under variable input dimensions and severe sensor dropout (up to 70%). Contribution/Results: Evaluated in simulation and on diverse real-world quadrupeds (e.g., Unitree A1, Go2), our policy demonstrates strong cross-hardware transferability, requiring only a single training run to adapt to differing sensor suites and mechanical designs. It maintains stable locomotion even under extreme input degradation, substantially improving the robustness and generalizability of learning-based locomotion policies for real-world deployment.

📝 Abstract
With the rising focus on quadrupeds, a generalized policy capable of handling different robot models and sensory inputs will be highly beneficial. Although several methods have been proposed to address different morphologies, it remains a challenge for learning-based policies to manage various combinations of proprioceptive information. This paper presents Masked Sensory-Temporal Attention (MSTA), a novel transformer-based model with masking for quadruped locomotion. It employs direct sensor-level attention to enhance sensory-temporal understanding and handle different combinations of sensor data, serving as a foundation for incorporating unseen information. This model can effectively understand its states even with a large portion of missing information, and is flexible enough to be deployed on a physical system despite the long input sequence.
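
To make "direct sensor-level attention" with masking more concrete, here is a minimal sketch of one way a proprioceptive history could be turned into per-reading tokens with random dropout. It is an illustration under stated assumptions, not the authors' implementation: the function name `tokenize_with_mask`, the `(timestep, sensor, value)` token layout, the zero fill value, and the toy data are all hypothetical.

```python
import random


def tokenize_with_mask(history, drop_prob, rng, mask_value=0.0):
    """Flatten a (timesteps x sensors) history into per-reading tokens.

    Each token is (timestep index, sensor index, value). A reading dropped
    by the random mask keeps its position but carries mask_value, and its
    entry in `keep` is False so a later attention layer could ignore it.
    """
    tokens, keep = [], []
    for t, frame in enumerate(history):
        for s, value in enumerate(frame):
            visible = rng.random() >= drop_prob
            tokens.append((t, s, value if visible else mask_value))
            keep.append(visible)
    return tokens, keep


# Hypothetical toy history: 3 timesteps, 4 scalar sensor channels.
history = [
    [0.1, -0.2, 0.3, 0.0],
    [0.2, -0.1, 0.4, 0.1],
    [0.3, 0.0, 0.5, 0.2],
]
tokens, keep = tokenize_with_mask(history, drop_prob=0.5, rng=random.Random(0))
```

Because every reading becomes its own token, the sequence length is the number of timesteps times the number of sensor channels (12 in this toy case), which is one way to read the abstract's remark about a long input sequence.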
Problem

Research questions and friction points this paper is trying to address.

Generalized policy for different robot models and sensors
Handling various combinations of proprioceptive information
Effective state understanding with missing sensor data
Innovation

Methods, ideas, or system contributions that make the work stand out.

Transformer-based mechanism with masking
Direct sensor-level attention for sensory-temporal understanding
Handles missing information and long input sequences
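
The points above rely on attention that can skip missing tokens. The sketch below shows standard scaled dot-product attention with a key mask in plain Python. It is a generic illustration rather than the paper's architecture, which this page does not specify; the function name `masked_attention`, the `visible` flag list, and the toy embeddings are assumptions.

```python
import math


def masked_attention(queries, keys, values, visible):
    """Scaled dot-product attention that ignores masked key tokens.

    queries, keys, values: lists of float vectors (keys and values aligned).
    visible: one boolean per key; False marks a masked (missing) token.
    Returns (outputs, weights), one row per query.
    """
    if not any(visible):
        raise ValueError("at least one token must be visible")
    scale = 1.0 / math.sqrt(len(keys[0]))
    outputs, weights = [], []
    for q in queries:
        # Masked keys get a score of -inf, so their softmax weight is 0.
        scores = [
            scale * sum(a * b for a, b in zip(q, k)) if seen else -math.inf
            for k, seen in zip(keys, visible)
        ]
        top = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        row = [e / total for e in exps]
        weights.append(row)
        outputs.append([
            sum(w * v[d] for w, v in zip(row, values))
            for d in range(len(values[0]))
        ])
    return outputs, weights


# Hypothetical toy embeddings for three tokens; the second one is masked.
embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
visible = [True, False, True]
outputs, weights = masked_attention(embeddings, embeddings, embeddings, visible)
```

Each row of `weights` sums to 1 over the visible tokens only, so a missing reading contributes nothing to the output while the remaining tokens are renormalized.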
Authors

Dikai Liu
University of Technology Sydney
Field Robotics, infrastructure robotics, human-robot collaboration

Tianwei Zhang
College of Computing and Data Science, Nanyang Technological University, Singapore

Jianxiong Yin
NVIDIA AI Technology Centre (NV AITC)

Simon See
NVIDIA
applied mathematics, AI, machine learning, High Performance Computing, Simulation