IMAGINE: Intelligent Multi-Agent Godot-based Indoor Networked Exploration

📅 2026-02-02
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This work proposes a decentralized multi-agent reinforcement learning (MARL) approach to collaborative multi-drone exploration of GNSS-denied indoor environments, addressing communication constraints, dynamic obstacles, and partial observability. Implemented in the high-fidelity Godot game engine, the method integrates LiDAR-based perception with local occupancy-map sharing and models the problem as a network-distributed POMDP (ND-POMDP) to enable communication-aware cooperative exploration in continuous action spaces. Departing from the conventional reliance on discrete actions, centralized control, prior maps, and persistent connectivity, the approach introduces curriculum learning and a lightweight neural architecture, significantly improving training efficiency, robustness, and scalability. This provides a practical and efficient solution for real-world deployment of multi-drone systems.
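The communication-aware map sharing described above can be sketched as a simple decentralized merging round: each drone fills in cells it has not yet observed using the occupancy maps of neighbours currently inside communication range. This is an illustrative sketch under assumed conventions (-1 = unknown, 0 = free, 1 = occupied; a fixed `COMM_RANGE` radius), not the paper's actual implementation:

```python
import numpy as np

COMM_RANGE = 5.0  # assumed communication radius; illustrative value only

def in_range(pos_a, pos_b, comm_range=COMM_RANGE):
    """True if two agents can exchange data this step."""
    return np.linalg.norm(np.asarray(pos_a) - np.asarray(pos_b)) <= comm_range

def merge_maps(own_map, neighbour_map):
    """Fill in cells that are unknown locally but observed by a neighbour."""
    merged = own_map.copy()
    unknown = merged == -1
    merged[unknown] = neighbour_map[unknown]
    return merged

def share_step(positions, maps):
    """One decentralized sharing round: every agent merges the maps of
    all neighbours currently within communication range."""
    updated = []
    for i, (pos_i, map_i) in enumerate(zip(positions, maps)):
        m = map_i
        for j, (pos_j, map_j) in enumerate(zip(positions, maps)):
            if i != j and in_range(pos_i, pos_j):
                m = merge_maps(m, map_j)
        updated.append(m)
    return updated
```

An agent outside everyone's range simply keeps its own map, which mirrors the abstract's point that the approach does not assume permanent connectivity.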

📝 Abstract
The exploration of unknown, Global Navigation Satellite System (GNSS)-denied environments by an autonomous communication-aware and collaborative group of Unmanned Aerial Vehicles (UAVs) presents significant challenges in coordination, perception, and decentralized decision-making. This paper implements Multi-Agent Reinforcement Learning (MARL) to address these challenges in a 2D indoor environment, using high-fidelity game-engine simulations (Godot) and continuous action spaces. Policy training aims to achieve emergent collaborative behaviours and decision-making under uncertainty using Network-Distributed Partially Observable Markov Decision Processes (ND-POMDPs). Each UAV is equipped with a Light Detection and Ranging (LiDAR) sensor and can share data (sensor measurements and a local occupancy map) with neighbouring agents. Inter-agent communication constraints include limited range, bandwidth and latency. Extensive ablation studies evaluated MARL training paradigms, the reward function, the communication system, the neural network (NN) architecture, memory mechanisms, and POMDP formulations. This work jointly addresses several key limitations in prior research, namely reliance on discrete actions, single-agent or centralized formulations, assumptions of a priori knowledge and permanent connectivity, inability to handle dynamic obstacles, short planning horizons, and architectural complexity in Recurrent NNs/Transformers. Results show that the scalable training paradigm, combined with a simplified architecture, enables rapid autonomous exploration of an indoor area. The implementation of curriculum learning (five increasingly complex levels) also enabled faster, more robust training. This combination of high-fidelity simulation, MARL formulation, and computational efficiency establishes a strong foundation for deploying learned cooperative strategies in physical robotic systems.
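The five-level curriculum mentioned in the abstract can be sketched as a wrapper that advances the environment's difficulty once the policy's rolling success rate clears a threshold. The level count comes from the abstract; the threshold, window size, and class name are illustrative assumptions, not the paper's reported values:

```python
from collections import deque

class Curriculum:
    """Hedged sketch of a five-level curriculum: start on the easiest
    level and advance when recent episodes succeed often enough."""

    def __init__(self, n_levels=5, success_threshold=0.8, window=100):
        self.level = 0                      # current difficulty (0 = easiest)
        self.n_levels = n_levels
        self.success_threshold = success_threshold
        self.recent = deque(maxlen=window)  # rolling episode outcomes

    def report(self, success):
        """Record one episode outcome; advance the level if the rolling
        success rate over a full window clears the threshold."""
        self.recent.append(bool(success))
        full = len(self.recent) == self.recent.maxlen
        rate = sum(self.recent) / len(self.recent)
        if full and rate >= self.success_threshold and self.level < self.n_levels - 1:
            self.level += 1
            self.recent.clear()             # re-evaluate on the new level
        return self.level
```

Clearing the rolling window on each promotion ensures the policy is judged only on episodes from the new, harder level.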
Problem

Research questions and friction points this paper is trying to address.

Multi-Agent Reinforcement Learning
GNSS-denied environment
collaborative exploration
decentralized decision-making
communication constraints
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-Agent Reinforcement Learning
ND-POMDP
Godot Simulation
Continuous Action Space
Curriculum Learning
Tiago Leite
INESC INOV, Instituto Superior Técnico, Universidade de Lisboa, Lisbon, 1000-029, Portugal
Maria Inês Conceição
INESC INOV, Instituto Superior Técnico, Universidade de Lisboa, Lisbon, 1000-029, Portugal; INESC ID–Instituto de Engenharia de Sistemas e Computadores: Investigação e Desenvolvimento, Instituto Superior Técnico, Lisbon, 1000-029, Portugal; Institute for Systems and Robotics, Instituto Superior Técnico, Universidade de Lisboa, Lisbon, 1049-001, Portugal
António Grilo
INESC-ID, IST, UTL, Lisboa, Portugal
Computer Networks
Wireless Communications