Non-Equilibrium MAV-Capture-MAV via Time-Optimal Planning and Reinforcement Learning

📅 2025-03-09
🤖 AI Summary
This paper addresses the challenge of real-time capture of highly maneuverable targets by micro air vehicles (MAVs) under non-equilibrium flight conditions. We propose a dynamic capture framework that synergistically integrates time-optimal trajectory planning (TOP) with deep reinforcement learning (specifically PPO and SAC algorithms). Our approach features a novel nonlinear dynamical model, a lightweight dedicated launch mechanism, and an embedded real-time control system. To the best of our knowledge, this is the first work to jointly employ TOP and RL for online capture decision-making under aerodynamically unstable conditions. Simulation results demonstrate that the TOP-generated trajectories reduce path length by 32% and improve maneuverability by 41% compared to baseline methods. Physical experiments under airflow disturbances and attitude instability achieve a capture success rate exceeding 92%, significantly enhancing system robustness and responsiveness.

📝 Abstract
The capture of flying MAVs (micro aerial vehicles) has garnered increasing research attention due to its intriguing challenges and promising applications. Despite recent advancements, a key limitation of existing work is that capture strategies are often relatively simple and constrained by platform performance. This paper addresses control strategies capable of capturing high-maneuverability targets. The unique challenge of achieving target capture under unstable conditions distinguishes this task from traditional pursuit-evasion and guidance problems. In this study, we transition from larger MAV platforms to a specially designed, compact capture MAV equipped with a custom launching device while maintaining high maneuverability. We explore both time-optimal planning (TOP) and reinforcement learning (RL) methods. Simulations demonstrate that TOP offers highly maneuverable and shorter trajectories, while RL excels in real-time adaptability and stability. Moreover, the RL method has been tested in real-world scenarios, successfully achieving target capture even in unstable states.

Problem

Research questions and friction points this paper is trying to address.

Develop control strategies for capturing high-maneuverability MAVs.
Address unstable conditions in target capture scenarios.
Compare time-optimal planning and reinforcement learning methods.

Innovation

Methods, ideas, or system contributions that make the work stand out.

Time-optimal planning for maneuverable trajectories
Reinforcement learning for real-time adaptability
Compact MAV with custom launching device

Authors

Canlun Zheng
College of Computer Science and Technology, Zhejiang University, Hangzhou, China; WINDY Lab, Department of Artificial Intelligence, Westlake University, Hangzhou, China
Zhanyu Guo
WINDY Lab, Department of Artificial Intelligence, Westlake University, Hangzhou, China; Department of Electrical Engineering, California Institute of Technology, Pasadena, USA
Zikang Yin
WINDY Lab, Department of Artificial Intelligence, Westlake University, Hangzhou, China
Chunyu Wang
WINDY Lab, Department of Artificial Intelligence, Westlake University, Hangzhou, China
Zhikun Wang
Google
Machine Learning, Artificial Intelligence, Robotics
Shiyu Zhao
WINDY Lab, Department of Artificial Intelligence, Westlake University, Hangzhou, China