TLE-Based A2C Agent for Terrestrial Coverage Orbital Path Planning

📅 2025-08-14
🤖 AI Summary
Increasing congestion in low Earth orbit (LEO) exacerbates mission planning complexity and collision risk for Earth observation satellites. This paper proposes a reinforcement learning–based orbital path planning method that formulates five-dimensional orbital element control as a Markov decision process. An Advantage Actor–Critic (A2C) agent is trained within a physics-informed Gymnasium simulation environment integrating Keplerian dynamics and Two-Line Element (TLE) data. Compared to a proximal policy optimization (PPO) baseline, the proposed approach achieves a 5.8× improvement in cumulative reward and converges 31.5× faster—attaining stable policy learning within only 2,000 training steps. Experimental results demonstrate that the Actor–Critic framework significantly outperforms trust-region methods in continuous orbital control, enabling high-precision Earth coverage and adaptive, safe, and efficient satellite deployment. The method provides a scalable solution for intelligent mission planning in densely populated LEO environments.

📝 Abstract
The increasing congestion of Low Earth Orbit (LEO) poses persistent challenges to the efficient deployment and safe operation of Earth observation satellites. Mission planners must now account not only for mission-specific requirements but also for the increasing collision risk with active satellites and space debris. This work presents a reinforcement learning framework using the Advantage Actor-Critic (A2C) algorithm to optimize satellite orbital parameters for precise terrestrial coverage within predefined surface radii. By formulating the problem as a Markov Decision Process (MDP) within a custom OpenAI Gymnasium environment, our method simulates orbital dynamics using classical Keplerian elements. The agent progressively learns to adjust five orbital parameters (semi-major axis, eccentricity, inclination, right ascension of the ascending node, and argument of perigee) to achieve targeted terrestrial coverage. Comparative evaluation against Proximal Policy Optimization (PPO) demonstrates A2C's superior performance, achieving 5.8x higher cumulative rewards (10.0 vs 9.263025) while converging in 31.5x fewer timesteps (2,000 vs 63,000). The A2C agent consistently meets mission objectives across diverse target coordinates while maintaining computational efficiency suitable for real-time mission planning applications. Key contributions include: (1) a TLE-based orbital simulation environment incorporating physics constraints, (2) validation of actor-critic methods' superiority over trust region approaches in continuous orbital control, and (3) demonstration of rapid convergence enabling adaptive satellite deployment. This approach establishes reinforcement learning as a computationally efficient alternative for scalable and intelligent LEO mission planning.
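The abstract's MDP formulation (a state of five Keplerian elements, actions that perturb them, and a coverage-based reward) can be sketched as a minimal Gymnasium-style environment. Everything here is an illustrative assumption rather than the paper's actual environment: the class name, element bounds, and the inclination-based coverage proxy are invented for the sketch, and real TLE propagation is omitted.

```python
import random

# Hypothetical sketch of the paper's MDP. The agent perturbs five Keplerian
# elements and is rewarded for bringing a crude coverage proxy close to a
# target latitude. Bounds and reward shape are assumptions, not the paper's.
class OrbitCoverageEnv:
    # Element order: [semi-major axis (km), eccentricity, inclination (deg),
    #                 RAAN (deg), argument of perigee (deg)]
    LOW  = [6871.0, 0.00,  0.0,   0.0,   0.0]   # roughly a 500 km LEO floor
    HIGH = [8371.0, 0.05, 98.0, 360.0, 360.0]   # roughly a 2000 km ceiling

    def __init__(self, target_lat_deg=10.0, seed=0):
        self.rng = random.Random(seed)
        self.target_lat = target_lat_deg
        self.state = None

    def reset(self):
        # Start from a random orbit inside the LEO bounds.
        self.state = [self.rng.uniform(lo, hi)
                      for lo, hi in zip(self.LOW, self.HIGH)]
        return list(self.state)

    def step(self, action):
        # Action: five bounded deltas applied to the orbital elements,
        # clamped so the orbit stays physically plausible.
        for k, (delta, lo, hi) in enumerate(zip(action, self.LOW, self.HIGH)):
            self.state[k] = min(hi, max(lo, self.state[k] + delta))
        # Coverage proxy: a satellite overflies latitudes up to its
        # inclination, so penalize distance from the target latitude.
        err = abs(self.state[2] - self.target_lat)
        reward = -err
        done = err < 1.0  # within one degree counts as covered
        return list(self.state), reward, done, {}
```

A real implementation would replace the reward proxy with ground-track coverage computed from propagated TLE data, but the reset/step interface shown is the shape an A2C or PPO trainer would consume.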
Problem

Research questions and friction points this paper is trying to address.

Optimize satellite orbits for precise terrestrial coverage
Reduce collision risks in congested Low Earth Orbit
Improve computational efficiency for real-time mission planning
Innovation

Methods, ideas, or system contributions that make the work stand out.

A2C algorithm optimizes satellite orbital parameters
MDP formulation in custom OpenAI Gymnasium environment
TLE-based orbital simulation with physics constraints
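The actor-critic mechanism behind the bullets above can be illustrated by the one-step quantities A2C computes per transition. This is a generic sketch of the standard A2C target and advantage, with an assumed discount factor; the paper's networks and hyperparameters are not specified here.

```python
def a2c_targets(reward, value_s, value_next, done, gamma=0.99):
    """One-step bootstrapped target and advantage used by A2C.

    td_target = r + gamma * V(s') * (1 - done)
    advantage = td_target - V(s)

    The actor is updated to increase the log-probability of actions
    weighted by the advantage; the critic regresses V(s) toward td_target.
    """
    td_target = reward + gamma * value_next * (0.0 if done else 1.0)
    advantage = td_target - value_s
    return td_target, advantage
```

Because A2C updates the policy directly from this advantage signal at every step, rather than solving PPO's clipped surrogate over collected batches, it can adapt quickly in low-dimensional continuous control problems like the five-element orbit adjustment described above.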