🤖 AI Summary
To maintain service continuity for edge computing in low-Earth-orbit (LEO) satellite networks with dynamic topologies, this paper proposes a spatio-temporal graph reinforcement learning framework that jointly optimizes service migration and resource allocation. First, the authors formulate a spatio-temporal graph Markov decision process (STG-MDP) that models the spatial dependencies between satellites and users and how they evolve over time. Second, they design a Graph-Aware Temporal Encoder (GATE) that combines a two-layer graph convolutional network with a temporal convolutional network to learn unified spatio-temporal representations of satellites and users. Third, they apply a Hybrid Proximal Policy Optimization (HPPO) algorithm whose multi-head actor determines discrete migration decisions and continuous resource allocation ratios at the same time. Extensive simulations across diverse scenarios indicate that the approach significantly reduces the service interruption rate and packet loss rate while improving resource utilization and global quality of service. The framework offers a scalable joint-optimization approach for edge intelligence onboard satellites.
📝 Abstract
The rapid expansion of latency-sensitive applications has renewed interest in deploying edge computing capabilities aboard satellite constellations to achieve global, seamless service coverage. On the one hand, the limited onboard computational and communication resources must be allocated efficiently to serve geographically distributed users. On the other hand, the dynamic nature of satellite orbits calls for effective service migration strategies that maintain service continuity and quality as satellite coverage areas change. We formulate this problem as a spatio-temporal Markov decision process in which satellites, ground users, and flight users are modeled as nodes in a time-varying graph. The node features incorporate queuing dynamics to characterize packet loss probabilities. To solve this problem, we propose a Graph-Aware Temporal Encoder (GATE) that jointly models spatial correlations and temporal dynamics. GATE uses a two-layer graph convolutional network to extract dependencies among satellites and users, and a temporal convolutional network to capture their short-term evolution, producing unified spatio-temporal representations. These representations are passed to a Hybrid Proximal Policy Optimization (HPPO) framework, which features a multi-head actor that outputs both discrete service migration decisions and continuous resource allocation ratios, along with a critic for value estimation. We conduct extensive simulations involving both persistent and intermittent users distributed across real-world population centers.
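To make the hybrid action space concrete, the following is a minimal sketch of how a multi-head actor might sample one discrete migration target and a set of continuous allocation ratios, and how the joint log-probability enters a PPO clipped surrogate. It is not the authors' implementation. The function names, the Gaussian parameterization of the continuous head, the softmax used to turn raw samples into ratios, and the clipping value of 0.2 are all assumptions made for illustration; in the paper's setting the logits and means would come from the GATE representations rather than being passed in by hand.

```python
import math
import random


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def gaussian_log_prob(x, mean, std):
    """Log-density of a univariate normal distribution at x."""
    return (-0.5 * ((x - mean) / std) ** 2
            - math.log(std) - 0.5 * math.log(2.0 * math.pi))


def sample_hybrid_action(migration_logits, alloc_means, alloc_std, rng):
    """Sample one hybrid action from two policy heads (hypothetical helper).

    Discrete head: categorical distribution over candidate target satellites.
    Continuous head: independent Gaussians whose samples are squashed by a
    softmax into allocation ratios that sum to 1 (an assumed design choice).
    """
    probs = softmax(migration_logits)
    target = rng.choices(range(len(probs)), weights=probs, k=1)[0]

    raw = [rng.gauss(mu, alloc_std) for mu in alloc_means]
    ratios = softmax(raw)

    # The joint log-probability is the sum over both heads. It is computed
    # on the raw (pre-softmax) samples, which are treated as the action.
    log_prob = math.log(probs[target]) + sum(
        gaussian_log_prob(r, mu, alloc_std) for r, mu in zip(raw, alloc_means)
    )
    return target, ratios, log_prob


def ppo_clip_loss(new_log_prob, old_log_prob, advantage, clip_eps=0.2):
    """Standard PPO clipped surrogate loss for a single sample."""
    ratio = math.exp(new_log_prob - old_log_prob)
    clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
    return -min(ratio * advantage, clipped * advantage)


# Example: three candidate satellites and three resource types (made-up numbers).
rng = random.Random(0)
target, ratios, log_prob = sample_hybrid_action(
    [0.1, 2.0, -1.0], [0.0, 0.5, -0.5], 0.3, rng
)
print(target, [round(r, 3) for r in ratios], round(log_prob, 3))
```

Summing the log-probabilities of the two heads lets a single importance ratio, and therefore a single clipped objective, cover both the discrete and the continuous part of the action. This is one common way to handle hybrid action spaces, but the paper may weight or clip the heads separately.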