🤖 AI Summary
This paper addresses the NP-hard, decentralized optimization problem of autonomous task allocation for drone swarms in large-scale, dynamic spatiotemporal environments. We propose a long-term/short-term collaborative decision-making framework: the long-term layer employs distributed deep reinforcement learning (DRL) to jointly optimize flight and charging policies, while the short-term layer leverages decentralized collective learning for real-time perception and navigation coordination. We introduce the first two-layer decoupled architecture that integrates DRL's long-horizon adaptability with collective learning's low latency and strong privacy preservation, thereby overcoming the fundamental trade-off among scalability, privacy protection, and long-term robustness. Evaluated on a real-world urban traffic dataset, our method improves the task completion rate by 27.83% over state-of-the-art collective learning baselines and by 23.17% over advanced DRL baselines, while significantly enhancing energy efficiency, monitoring accuracy, and operational sustainability.
📝 Abstract
This paper addresses the problem of autonomous task allocation by a swarm of interactive drones in large-scale, dynamic spatio-temporal environments. When each drone independently chooses among navigation, sensing, and recharging options such that system-wide sensing requirements are met, the collective decision-making becomes an NP-hard decentralized combinatorial optimization problem. Existing solutions face significant limitations: distributed optimization methods such as collective learning often lack long-term adaptability, while centralized deep reinforcement learning (DRL) suffers from high computational complexity, poor scalability, and privacy concerns. To overcome these challenges, we propose a novel hybrid optimization approach that combines long-term DRL with short-term collective learning. In this approach, each drone uses DRL to proactively determine high-level strategies, such as flight direction and recharging behavior, while leveraging collective learning to coordinate short-term sensing and navigation tasks with other drones in a decentralized manner. Extensive experiments on datasets derived from realistic urban mobility demonstrate that the proposed solution outperforms standalone state-of-the-art collective learning and DRL approaches by 27.83% and 23.17%, respectively. Our findings highlight the complementary strengths of short-term and long-term decision-making, enabling energy-efficient, accurate, and sustainable traffic monitoring through swarms of drones.
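The two-layer split described above can be sketched as a per-drone decision loop. The sketch below is purely illustrative: the function names, the battery-threshold heuristic standing in for the trained DRL policy, and the overlap-minimizing rule standing in for collective learning are assumptions, not the paper's actual algorithms.

```python
import random

# Illustrative high-level actions the long-term layer chooses among
# (flight directions plus a recharging action, as in the abstract).
HIGH_LEVEL_ACTIONS = ["fly_north", "fly_south", "fly_east", "fly_west", "recharge"]

def long_term_policy(battery: float) -> str:
    """Stand-in for the long-term DRL layer: recharge when energy is low,
    otherwise pick a flight direction. A real system would query a learned
    policy network instead of this hand-written heuristic."""
    if battery < 0.2:
        return "recharge"
    return random.choice(HIGH_LEVEL_ACTIONS[:4])

def short_term_coordination(plans, candidate_plans, i):
    """Stand-in for decentralized collective learning: drone i selects the
    candidate sensing plan (a set of cell IDs) that least overlaps with the
    other drones' currently announced plans, spreading sensing coverage."""
    others = set()
    for j, plan in enumerate(plans):
        if j != i:
            others.update(plan)
    return min(candidate_plans, key=lambda p: len(others.intersection(p)))
```

For example, with announced plans `[[1, 2], [3, 4], [5, 6]]`, drone 0 choosing between candidates `[1, 3]` and `[7, 8]` would pick `[7, 8]`, since it overlaps with no other drone's plan.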