🤖 AI Summary
This work addresses robust spacecraft trajectory optimization under uncertain initial conditions and process noise, proposing a framework that makes no assumption about the uncertainty distribution beyond the ability to sample from it. The approach first computes a deterministic nominal trajectory offline, then uses chance-constrained reinforcement learning to learn an affine closed-loop correction law comprising a feedforward control adjustment and time-varying feedback gains. Probabilistic constraints are enforced empirically through upper-tail quantiles estimated from rollouts, and covariance-feasibility penalties regulate terminal dispersion. Evaluated on a three-dimensional multi-impulse Earth-Mars transfer and a continuous-thrust atmospheric pinpoint landing problem, the method remains competitive in upper-tail fuel cost while preserving probabilistic feasibility, and the same robustification structure carries across both problems without redesign of its core.
📝 Abstract
This paper presents a distribution-agnostic robust trajectory-optimization framework based on chance-constrained reinforcement learning. The uncertainty is represented here through initial conditions and process noise, with the only requirement being that it can be sampled. A deterministic nominal trajectory is first computed offline, and reinforcement learning is then used only to robustify that baseline through a structured affine closed-loop correction law comprising a feedforward control adjustment and time-varying feedback gains. Probabilistic feasibility is enforced empirically through rollout-based upper-tail quantiles, while terminal dispersion is regulated through covariance-feasibility penalties. The framework is assessed on two materially different trajectory design problems. The flagship case study is a three-dimensional multi-impulse Earth-Mars transfer, where the learned policy is benchmarked against a recent robust trajectory-optimization reference under Gaussian uncertainty and then evaluated under bounded uniform uncertainty and under process disturbances not seen during training. The second case study is a stochastic atmospheric pinpoint rocket landing problem, used to assess portability to a short-horizon continuous-thrust setting with drag, mass depletion, and glide-slope constraints. The results show that the proposed framework can remain competitive in upper-tail fuel cost while preserving probabilistic feasibility, and that the same robustification scaffold can be carried across heterogeneous spacecraft trajectory planning problems without redesign of its core stochastic-control structure.
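To make the structure described in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes a toy discrete-time double integrator in place of the spacecraft dynamics, hand-picked gains in place of gains learned by reinforcement learning, a control-magnitude limit as the chance constraint, and a largest-eigenvalue excess as one possible covariance-feasibility penalty. All names (`rollout`, `evaluate`, `sample_x0`, `sample_noise`, `u_max`, `P_max`) are hypothetical. The uncertainty enters only through sampler callbacks, mirroring the requirement that it can be sampled.

```python
import numpy as np

def rollout(x0, x_nom, u_nom, du, K, A, B, noise):
    """Propagate one sample under u_k = u_nom_k + du_k + K_k (x_k - x_nom_k)."""
    x, controls = x0.copy(), []
    for k in range(len(u_nom)):
        u = u_nom[k] + du[k] + K[k] @ (x - x_nom[k])  # affine correction law
        controls.append(u)
        x = A @ x + B @ u + noise[k]                  # assumed linear toy dynamics
    return x, np.array(controls)

def evaluate(du, K, x_nom, u_nom, A, B, sample_x0, sample_noise,
             u_max, P_max, eps=0.05, n_rollouts=500, seed=0):
    """Return the empirical (1 - eps) quantile of the constraint margin
    and a covariance penalty on the terminal state."""
    rng = np.random.default_rng(seed)
    finals, margins = [], []
    for _ in range(n_rollouts):
        x0 = sample_x0(rng)
        w = sample_noise(rng, len(u_nom))
        xN, controls = rollout(x0, x_nom, u_nom, du, K, A, B, w)
        finals.append(xN)
        # Worst control magnitude along the rollout, relative to the limit.
        margins.append(np.linalg.norm(controls, axis=1).max() - u_max)
    # Chance constraint is treated as satisfied when this quantile is <= 0.
    q = float(np.quantile(margins, 1.0 - eps))
    # Assumed penalty: how far the largest eigenvalue of (P - P_max) exceeds zero.
    P = np.cov(np.array(finals).T)
    cov_penalty = max(0.0, float(np.linalg.eigvalsh(P - P_max)[-1]))
    return q, cov_penalty

# Toy problem: double integrator with unit time step (an assumption).
N = 10
A = np.array([[1.0, 1.0], [0.0, 1.0]])
B = np.array([[0.5], [1.0]])
u_nom = np.full((N, 1), 0.1)
x_nom = np.zeros((N + 1, 2))
for k in range(N):
    x_nom[k + 1] = A @ x_nom[k] + B @ u_nom[k]  # deterministic nominal trajectory

# Samplers stand in for any distribution; Gaussian is used only for illustration.
sample_x0 = lambda rng: x_nom[0] + rng.normal(0.0, 0.1, size=2)
sample_noise = lambda rng, n: rng.normal(0.0, 0.01, size=(n, 2))

du = np.zeros((N, 1))                                    # feedforward adjustment
K_closed = np.tile(np.array([[-0.3, -0.8]]), (N, 1, 1))  # hand-picked gains
K_open = np.zeros((N, 1, 2))                             # no feedback

common = dict(x_nom=x_nom, u_nom=u_nom, A=A, B=B,
              sample_x0=sample_x0, sample_noise=sample_noise,
              u_max=1.0, P_max=0.05 * np.eye(2))
q_closed, pen_closed = evaluate(du, K_closed, **common)
q_open, pen_open = evaluate(du, K_open, **common)
```

In this toy setting, the open-loop case lets initial velocity errors grow into large terminal position dispersion, so its covariance penalty is positive, while the feedback gains contract the dispersion at the cost of extra control effort. A learning loop would adjust `du` and `K` to trade these two quantities against fuel cost; that loop is omitted here.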