An Imitative Reinforcement Learning Framework for Autonomous Dogfight

📅 2024-06-17
🏛️ arXiv.org
📈 Citations: 1
Influential: 0
🤖 AI Summary
Unmanned Combat Aerial Vehicle (UCAV) close-range air combat suffers from limited exploration capability, low sample efficiency, and simulation-to-reality discrepancy, hindering effective autonomous policy learning. Method: We propose an imitation-reinforcement learning framework integrating expert demonstrations with autonomous exploration. Specifically, we develop a high-fidelity air combat simulator based on the Harfang3D sandbox; initialize the policy via behavior cloning, then fine-tune it online using Proximal Policy Optimization (PPO); and enforce state-action alignment constraints alongside adversarial perturbation for enhanced robustness. Contribution/Results: Our work introduces the first synergistic imitation–reinforcement learning mechanism that jointly ensures efficient policy initialization and dynamic adaptability. Experiments demonstrate a 100% success rate across the full "pursuit–lock–launch" task pipeline—significantly outperforming pure RL or IL baselines—while exhibiting strong generalization and operational robustness under realistic conditions.
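The two-stage recipe above (behavior cloning for policy initialization, followed by online PPO fine-tuning) can be sketched in miniature. The toy expert, the linear softmax policy, and all dimensions below are illustrative assumptions, not the paper's actual setup; only the behavior-cloning stage is implemented, with PPO fine-tuning left as the second stage.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy problem: 2-D state, 3 discrete actions.
# The "expert" picks the argmax of three linear scores, so a linear
# softmax policy can in principle recover it exactly.
def expert_action(s):
    return int(np.argmax([s[0], s[1], 0.5]))

states = rng.uniform(0.0, 1.0, size=(500, 2))
actions = np.array([expert_action(s) for s in states])

# Linear softmax policy: logits = s @ W + b
W = np.zeros((2, 3))
b = np.zeros(3)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def nll(S, A):
    # Negative log-likelihood of expert actions under the current policy.
    p = softmax(S @ W + b)
    return -np.log(p[np.arange(len(A)), A] + 1e-12).mean()

loss_before = nll(states, actions)

# Stage 1 (behavior cloning): supervised cross-entropy on expert
# state-action pairs, via full-batch gradient descent.
lr = 0.5
onehot = np.eye(3)[actions]
for _ in range(300):
    p = softmax(states @ W + b)
    grad_logits = (p - onehot) / len(states)
    W -= lr * states.T @ grad_logits
    b -= lr * grad_logits.sum(axis=0)

loss_after = nll(states, actions)
acc = (softmax(states @ W + b).argmax(axis=1) == actions).mean()
print(f"BC loss: {loss_before:.3f} -> {loss_after:.3f}, "
      f"expert-match accuracy: {acc:.2f}")

# Stage 2 (not shown): the cloned (W, b) would seed the actor of a PPO
# learner, which then continues training against the simulator reward.
```

The point of the sketch is the ordering: imitation supplies a competent initial policy cheaply, so the subsequent RL stage starts exploration from expert-like behavior rather than from scratch.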

📝 Abstract
Unmanned Combat Aerial Vehicle (UCAV) dogfight, which refers to a fight between two or more UCAVs, usually at close quarters, plays a decisive role on the aerial battlefield. With the evolution of artificial intelligence, dogfight is progressively transitioning toward intelligent and autonomous modes. However, the development of autonomous dogfight policy learning is hindered by challenges such as weak exploration capabilities, low learning efficiency, and unrealistic simulated environments. To overcome these challenges, this paper proposes a novel imitative reinforcement learning framework that efficiently leverages expert data while enabling autonomous exploration. The proposed framework not only enhances learning efficiency through expert imitation but also ensures adaptability to dynamic environments via autonomous exploration with reinforcement learning. Therefore, the framework can learn a successful 'pursuit-lock-launch' dogfight policy for UCAVs. To support data-driven learning, we establish a dogfight environment based on the Harfang3D sandbox, where we conduct extensive experiments. The results indicate that the proposed framework excels in multistage dogfight and significantly outperforms state-of-the-art reinforcement learning and imitation learning methods. Thanks to its ability to imitate experts and explore autonomously, our framework quickly learns the critical knowledge of complex aerial combat tasks, achieving up to a 100% success rate and demonstrating excellent robustness.
Problem

Research questions and friction points this paper is trying to address.

Develops autonomous dogfight policy for UCAVs.
Overcomes weak exploration and low learning efficiency.
Enhances adaptability in dynamic aerial combat environments.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Imitative reinforcement learning for UCAV dogfight
Combines expert data with autonomous exploration
Achieves high success rate in dynamic environments
Authors
Siyuan Li, Harbin Institute of Technology, China
Rongchang Zuo, Harbin Institute of Technology, China
Peng Liu, Harbin Institute of Technology, China
Yingnan Zhao, Harbin Engineering University, China