🤖 AI Summary
This paper targets autonomous Earth observation by a heterogeneous optical/SAR satellite constellation, where decisions must be made in real time under partial observability, energy and storage constraints, and agent heterogeneity. It proposes a multi-agent reinforcement learning (MARL)-based collaborative resource optimization framework and evaluates MAPPO, HAPPO, and HATRPO for jointly handling cross-payload task allocation and dynamic resource balancing while mitigating non-stationarity and reward coupling. Evaluated on a high-fidelity simulation platform built on Basilisk and BSK-RL, the framework achieves a 23.6% improvement in imaging coverage and a 31.2% gain in energy efficiency. It supports scalable, adaptive constellation-level autonomous decision-making under operational constraints and provides a foundation for intelligent Earth observation mission planning.
📝 Abstract
This work investigates resource optimization in heterogeneous satellite clusters performing autonomous Earth Observation (EO) missions using Reinforcement Learning (RL). In the proposed setting, two optical satellites and one Synthetic Aperture Radar (SAR) satellite operate cooperatively in low Earth orbit to capture ground targets and manage their limited onboard resources efficiently. Traditional optimization methods struggle to handle the real-time, uncertain, and decentralized nature of EO operations, motivating the use of RL and Multi-Agent Reinforcement Learning (MARL) for adaptive decision-making. This study systematically formulates the optimization problem from single-satellite to multi-satellite scenarios, addressing key challenges including energy and memory constraints, partial observability, and agent heterogeneity arising from diverse payload capabilities. Using a near-realistic simulation environment built on the Basilisk and BSK-RL frameworks, we evaluate the performance and stability of state-of-the-art MARL algorithms such as MAPPO, HAPPO, and HATRPO. Results show that MARL enables effective coordination across heterogeneous satellites, balancing imaging performance and resource utilization while mitigating non-stationarity and inter-agent reward coupling. The findings provide practical insights into scalable, autonomous satellite operations and contribute a foundation for future research on intelligent EO mission planning under heterogeneous and dynamic conditions.
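To make the problem setting concrete, the following is a minimal toy sketch of a three-satellite cluster (two optical, one SAR) with per-agent energy and storage limits and a shared team reward. It is not the paper's Basilisk/BSK-RL environment: the class `Satellite`, the functions `make_cluster`, `can_image`, and `step`, the cost table `IMAGE_COST`, and all numeric values are hypothetical and chosen only for illustration. The rule that optical payloads need a sunlit target while SAR does not is likewise an assumption of this sketch, used to illustrate payload heterogeneity.

```python
from dataclasses import dataclass


@dataclass
class Satellite:
    name: str
    payload: str          # "optical" or "sar"
    battery: float = 1.0  # normalized state of charge in [0, 1]
    storage: float = 0.0  # normalized data buffer fill level in [0, 1]


# Hypothetical per-payload cost of one image: (energy drawn, storage filled).
IMAGE_COST = {"optical": (0.10, 0.20), "sar": (0.25, 0.30)}


def make_cluster():
    """Two optical satellites and one SAR satellite, as in the abstract."""
    return [
        Satellite("opt-1", "optical"),
        Satellite("opt-2", "optical"),
        Satellite("sar-1", "sar"),
    ]


def can_image(sat, target_is_lit):
    """Check payload and resource constraints for a single imaging action."""
    energy, data = IMAGE_COST[sat.payload]
    # Assumption of this sketch: optical payloads need a sunlit target.
    if sat.payload == "optical" and not target_is_lit:
        return False
    return sat.battery >= energy and sat.storage + data <= 1.0


def step(sats, actions, target_is_lit):
    """Apply one action per satellite and return the shared team reward."""
    imaged = 0
    for sat, action in zip(sats, actions):
        if action == "image" and can_image(sat, target_is_lit):
            energy, data = IMAGE_COST[sat.payload]
            sat.battery -= energy
            sat.storage += data
            imaged += 1
        elif action == "downlink":
            sat.storage = max(0.0, sat.storage - 0.5)
        elif action == "charge":
            sat.battery = min(1.0, sat.battery + 0.2)
    # The target counts once for the whole team: duplicate images spend
    # energy and storage without adding reward (reward coupling).
    return 1.0 if imaged > 0 else 0.0


if __name__ == "__main__":
    cluster = make_cluster()
    reward = step(cluster, ["image", "image", "image"], target_is_lit=False)
    print(reward, [(s.name, s.battery, s.storage) for s in cluster])
```

In this toy model every agent receives the same reward regardless of which satellite captured the target, so each agent's best action depends on what the others do. That is the kind of coupling cooperative MARL algorithms such as MAPPO, HAPPO, and HATRPO are meant to handle.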