🤖 AI Summary
To address insufficient modeling of human panic behavior and poor real-time performance of evacuation guidance in partially observable urban fire environments, this paper proposes a psychology-informed heterogeneous UAV coordination framework for emergency evacuation. Methodologically, we formulate a human-UAV collaborative decision-making model based on Partially Observable Markov Decision Processes (POMDPs) and enhance multi-agent Proximal Policy Optimization (PPO) with recurrent neural networks to enable dynamic victim localization and online path re-planning under stochastic fire propagation and severely degraded visibility. Crucially, we integrate a bi-level panic behavior model that explicitly differentiates high-level (situational awareness) and low-level (proximal guidance) UAV roles. Simulation results demonstrate significant improvements: average evacuation time is reduced by 37.2%, and successful interception rate increases by 51.6%, validating the effectiveness of psychological mechanism–driven autonomous coordination in extreme human–machine coexistence scenarios.
📝 Abstract
Autonomous drone technology holds significant promise for enhancing search and rescue operations during evacuations by guiding humans toward safety and supporting broader emergency response efforts. However, their application in dynamic, real-time evacuation support remains limited. Existing models often overlook the psychological and emotional complexity of human behavior under extreme stress. In real-world fire scenarios, evacuees frequently deviate from designated safe routes due to panic and uncertainty. To address these challenges, this paper presents a multi-agent coordination framework in which autonomous Unmanned Aerial Vehicles (UAVs) assist human evacuees in real-time by locating, intercepting, and guiding them to safety under uncertain conditions. We model the problem as a Partially Observable Markov Decision Process (POMDP), where two heterogeneous UAV agents, a high-level rescuer (HLR) and a low-level rescuer (LLR), coordinate through shared observations and complementary capabilities. Human behavior is captured using an agent-based model grounded in empirical psychology, where panic dynamically affects decision-making and movement in response to environmental stimuli. The environment features stochastic fire spread, unknown evacuee locations, and limited visibility, requiring UAVs to plan over long horizons to search for humans and adapt in real-time. Our framework employs the Proximal Policy Optimization (PPO) algorithm with recurrent policies to enable robust decision-making in partially observable settings. Simulation results demonstrate that the UAV team can rapidly locate and intercept evacuees, significantly reducing the time required for them to reach safety compared to scenarios without UAV assistance.