AI Summary
To address safety risks arising from sensor failures and inaccurate state estimation in UAVs under physical attacks (e.g., GPS spoofing), this paper proposes a two-stage latent-state encoding framework that requires no privileged information at deployment. Methodologically, it introduces a teacher-student representation learning mechanism in which a student encoder learns, from historical sensor data alone, to reproduce the attack-aware latent representations of a teacher encoder; these representations are integrated with model-free reinforcement learning for end-to-end attack-resilient control. The key contribution is the first unified formulation of attack-aware latent representation learning under lightweight deployment constraints, which significantly improves generalization to unseen attack types. Experiments demonstrate a 32.7% improvement in safe flight success rate across diverse physical attack scenarios, a 41% reduction in training cost, and no reliance on attack labels or auxiliary hardware at deployment.
Abstract
Unmanned Aerial Vehicles (UAVs) depend on onboard sensors for perception, navigation, and control. However, these sensors are susceptible to physical attacks, such as GPS spoofing, that can corrupt state estimates and lead to unsafe behavior. While reinforcement learning (RL) offers adaptive control capabilities, existing safe RL methods are ineffective against such attacks. We present ARMOR (Adaptive Robust Manipulation-Optimized State Representations), an attack-resilient, model-free RL controller that enables robust UAV operation under adversarial sensor manipulation. Instead of relying on raw sensor observations, ARMOR learns a robust latent representation of the UAV's physical state via a two-stage training framework. In the first stage, a teacher encoder, trained with privileged attack information, generates attack-aware latent states for RL policy training. In the second stage, a student encoder is trained via supervised learning to approximate the teacher's latent states using only historical sensor data, enabling real-world deployment without privileged information. Our experiments show that ARMOR outperforms conventional methods, ensuring UAV safety. Additionally, ARMOR improves generalization to unseen attacks and reduces training cost by eliminating the need for iterative adversarial training.
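The two-stage scheme described in the abstract can be illustrated with a toy sketch. This is not ARMOR's implementation: the 1-D flight log, the fixed random `teacher_latents` map standing in for a trained teacher, the linear student, and every function and parameter name below are assumptions made purely for illustration. The sketch only shows the data flow: the teacher sees the privileged attack offset, while the student is fitted by supervised regression to reproduce the teacher's latents from a window of past sensor readings.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate(steps=500, attack_start=250, bias=3.0):
    """Toy 1-D flight log: spoofed GPS reading plus the (privileged) attack offset."""
    true_pos = np.cumsum(rng.normal(0.0, 0.1, steps))
    offset = np.where(np.arange(steps) >= attack_start, bias, 0.0)
    reading = true_pos + offset + rng.normal(0.0, 0.05, steps)
    return reading, offset

def teacher_latents(reading, offset, W):
    """Stand-in for the stage-1 teacher: it sees readings AND the attack offset."""
    return np.tanh(np.stack([reading, offset], axis=1) @ W)

def history_windows(reading, k):
    """Stack the last k readings for every time step t >= k - 1."""
    return np.stack([reading[t - k + 1 : t + 1] for t in range(k - 1, len(reading))])

def _with_bias(windows):
    """Append a constant column so the linear student has an intercept."""
    return np.hstack([windows, np.ones((len(windows), 1))])

def fit_student(windows, targets):
    """Stage 2: least-squares regression from sensor history to teacher latents."""
    weights, *_ = np.linalg.lstsq(_with_bias(windows), targets, rcond=None)
    return weights

def student_latents(windows, weights):
    """Deployment-time encoder: uses sensor history only, no privileged input."""
    return _with_bias(windows) @ weights

# Usage: distill a 4-dimensional teacher latent into a student with an 8-step history.
k = 8
reading, offset = simulate()
W = rng.normal(size=(2, 4))                    # hypothetical fixed teacher weights
Z = teacher_latents(reading, offset, W)[k - 1:]  # align targets with the windows
H = history_windows(reading, k)
weights = fit_student(H, Z)
print("student distillation MSE:", np.mean((student_latents(H, weights) - Z) ** 2))
```

In a realistic setting both encoders would be neural networks and the teacher's latents would feed an RL policy during stage 1; the linear least-squares student here is simply the smallest model that makes the supervised distillation step concrete.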