🤖 AI Summary
Static architectural approaches exhibit insufficient adaptability in Systems of Systems (SoS) due to highly uncertain and dynamically evolving mission environments. Method: This paper proposes a novel mission engineering paradigm integrating digital engineering with deep reinforcement learning (DRL). It constructs a high-fidelity digital mission model, formalizes tactical mission management as a Markov Decision Process (MDP), and trains adaptive mission coordination policies via the Proximal Policy Optimization (PPO) algorithm within an agent-based simulation sandbox. Contribution/Results: The work pioneers deep coupling between digital engineering and DRL, enabling mission-agnostic online task allocation and dynamic system reconfiguration. Evaluated on an aerial wildfire suppression case study, the framework significantly improves mission completion stability and reduces performance volatility—overcoming the adaptability bottleneck of conventional static architectures in dynamic operational settings.
📝 Abstract
As systems engineering (SE) objectives evolve from the design and operation of monolithic systems toward complex Systems of Systems (SoS), the discipline of Mission Engineering (ME) has emerged and is increasingly accepted as a new line of thinking in the SE community. Mission environments are uncertain and dynamic, and mission outcomes are a direct function of how mission assets interact with that environment. This renders static architectures brittle and calls for analytically rigorous approaches to ME. To that end, this paper proposes an intelligent mission coordination methodology that integrates digital mission models with Reinforcement Learning (RL) and specifically addresses the need for adaptive task allocation and reconfiguration. More specifically, we leverage a Digital Engineering (DE) based infrastructure composed of a high-fidelity digital mission model and an agent-based simulation; we then formulate the mission tactics management problem as a Markov Decision Process (MDP) and employ an RL agent trained via Proximal Policy Optimization (PPO). Using the simulation as a sandbox, we map system states to actions and refine the policy based on realized mission outcomes. The utility of the RL-based intelligent mission coordinator is demonstrated through an aerial firefighting case study. Our findings indicate that the RL-based coordinator not only surpasses baseline performance but also significantly reduces variability in mission performance. This study thus serves as a proof of concept that DE-enabled mission simulations combined with advanced analytical tools offer a mission-agnostic framework for improving ME practice, one that can be extended to more complex fleet design and selection problems from a mission-first perspective.
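To make the MDP-plus-PPO pipeline described above concrete, the following is a minimal sketch, not the paper's actual digital mission model or training code. It pairs a hypothetical toy firefighting MDP (states track active fires and free assets; actions are "hold" or "dispatch") with the clipped surrogate objective that defines a PPO update; all class and function names here are illustrative assumptions.

```python
import numpy as np

# Hypothetical toy stand-in for a digital mission model: the state is
# (active_fires, free_assets); action 1 dispatches an asset, action 0 holds.
class ToyFireMissionMDP:
    def __init__(self, fires=3, assets=2, seed=0):
        self.rng = np.random.default_rng(seed)
        self.fires, self.assets = fires, assets

    def reset(self):
        self.state = np.array([self.fires, self.assets], dtype=float)
        return self.state.copy()

    def step(self, action):
        fires, free = self.state
        if action == 1 and free > 0 and fires > 0:  # dispatch: suppress one fire
            fires -= 1
            free -= 1
            reward = 1.0
        else:                                       # hold or infeasible dispatch
            reward = -0.1
        # A dispatched asset returns with some probability each step.
        free = min(free + (self.rng.random() < 0.3), self.assets)
        self.state = np.array([fires, free], dtype=float)
        return self.state.copy(), reward, fires == 0

# PPO's clipped surrogate objective, evaluated on a batch of policy
# probability ratios r = pi_new(a|s) / pi_old(a|s) and advantage estimates.
def ppo_clip_objective(ratios, advantages, eps=0.2):
    unclipped = ratios * advantages
    clipped = np.clip(ratios, 1.0 - eps, 1.0 + eps) * advantages
    return np.minimum(unclipped, clipped).mean()  # maximized during training

# Example rollout: collect one episode under a random behavior policy.
env = ToyFireMissionMDP()
state, done, total_reward = env.reset(), False, 0.0
while not done:
    action = env.rng.integers(2)          # placeholder for the learned policy
    state, reward, done = env.step(action)
    total_reward += reward
```

In a full implementation, the sandbox simulation would generate trajectories, advantages would come from a value-function baseline (e.g., GAE), and the policy network would be updated by gradient ascent on `ppo_clip_objective`; the clipping term is what keeps each policy update close to the data-collecting policy.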