AI Summary
To address the challenge of real-time obstacle avoidance and safety assurance for unmanned aerial vehicles (UAVs) performing vision-language navigation (VLN) under natural-language instructions, this paper proposes a scene-aware adaptive safety boundary algorithm. The method introduces a novel depth-map-driven control barrier function (CBF), tightly integrating RGB-D sensing, CLIP-based language understanding, and YOLO-based object detection to identify moving obstacles and adapt safety margins online, overcoming the limitations of conventional static safety constraints. The approach is evaluated using ROS on a Parrot Bebop2 quadrotor in the Gazebo simulator. Experimental results show that, compared to a CBF-free baseline, the proposed method improves task success rates by 59.4% to 61.8% with only a marginal (5.4% to 8.2%) increase in trajectory length, while enabling real-time recovery from hazardous states.
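The core idea above (evaluating a depth-map-driven CBF with an adaptive margin and filtering the commanded velocity) can be illustrated with a minimal sketch. This is not the paper's implementation: the function name, the adaptive-margin form `d_safe + margin_gain / d_min`, and the discrete-time CBF condition `v <= alpha * h` are illustrative assumptions.

```python
import numpy as np

def asma_cbf_filter(depth_crop, v_cmd, d_safe=1.0, margin_gain=0.5, alpha=2.0):
    """Hypothetical sketch of a depth-driven CBF velocity filter.

    depth_crop : 2-D array of depth values (m) cropped around a tracked obstacle
    v_cmd      : commanded forward velocity (m/s) from the VLN policy
    Returns a possibly reduced velocity keeping the barrier h = d_min - d_adapt >= 0.
    """
    d_min = float(np.min(depth_crop))               # closest point in the crop
    # Adaptive margin (assumed form): widen the safety distance as the obstacle nears.
    d_adapt = d_safe + margin_gain / max(d_min, 1e-3)
    h = d_min - d_adapt                             # barrier value
    if h < 0.0:
        # Unsafe: command a proportional backward recovery motion.
        return alpha * h
    # Safe: cap forward speed so the barrier cannot be violated in one step.
    return min(v_cmd, alpha * h)
```

A real deployment would solve a CBF-constrained quadratic program over the full velocity command rather than this scalar cap, but the structure (barrier from depth, margin adapted to the scene, recovery when h < 0) mirrors what the summary describes.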
Abstract
In the rapidly evolving field of vision-language navigation (VLN), ensuring robust safety mechanisms remains an open challenge. Control barrier functions (CBFs) are efficient tools that guarantee safety by solving an optimal control problem. In this work, we consider a teleoperated drone in a VLN setting and add safety features by formulating a novel scene-aware CBF using ego-centric observations obtained through an RGB-D sensor. As a baseline, we implement a vision-language understanding module that uses the contrastive language-image pretraining (CLIP) model to query for a landmark specified by the user in natural language. Detections from the YOLO (You Only Look Once) object detector are cropped and verified by the CLIP model, triggering downstream navigation. To improve the navigation safety of this baseline, we propose ASMA -- an Adaptive Safety Margin Algorithm -- that crops the drone's depth map to track moving objects and perform scene-aware CBF evaluation on the fly. By identifying potentially risky observations in the scene, ASMA enables real-time adaptation to unpredictable environmental conditions, ensuring safe bounds on the actions of a VLN-powered drone. Using the robot operating system (ROS) middleware on a Parrot Bebop2 quadrotor in the Gazebo environment, ASMA offers a 59.4% to 61.8% increase in success rate with insignificant (5.4% to 8.2%) increases in trajectory length compared to the baseline CBF-less VLN, while recovering from unsafe situations.
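The landmark-verification step described in the abstract (a YOLO crop checked by CLIP against the user's instruction) reduces, at its core, to comparing image and text embeddings. The sketch below shows only that matching step, assuming the crop and instruction have already been embedded by CLIP's image and text encoders (not shown); the function name and threshold value are illustrative assumptions, not the paper's API.

```python
import numpy as np

def verify_landmark(crop_embedding, text_embedding, threshold=0.3):
    """Hypothetical sketch of CLIP-style landmark verification.

    crop_embedding : embedding of a YOLO detection crop (image encoder output)
    text_embedding : embedding of the user's landmark phrase (text encoder output)
    Returns True if cosine similarity clears the threshold, triggering navigation.
    """
    a = crop_embedding / np.linalg.norm(crop_embedding)
    b = text_embedding / np.linalg.norm(text_embedding)
    return float(a @ b) >= threshold
```

In practice one would score every detected crop against the instruction and navigate toward the best match above the threshold.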