🤖 AI Summary
Problem: Industrial process control remains challenging in domains such as biopharmaceutical manufacturing, where observational data are limited, process dynamics are nonlinear and not fully known, and model uncertainty is high.
Method: This paper proposes the Actor-Simulator, a dual-component model-based reinforcement learning framework that jointly calibrates a digital twin and searches for an optimal control policy. The approach combines Bayesian system identification, stochastic nonlinear dynamical modeling, and an adaptive exploration-exploitation mechanism in which control policy performance guides data acquisition. The authors prove that the procedure converges to the optimal policy.
Results: In numerical experiments based on biopharmaceutical manufacturing, the method is reported to reduce model prediction error, improve closed-loop robustness and sample efficiency, and outperform existing methods.
📝 Abstract
This paper presents a novel methodological framework, called the Actor-Simulator, that incorporates the calibration of digital twins into model-based reinforcement learning for more effective control of stochastic systems with complex nonlinear dynamics. Traditional model-based control often relies on restrictive structural assumptions (such as linear state transitions) and fails to account for parameter uncertainty in the model. These issues become particularly critical in industries such as biopharmaceutical manufacturing, where process dynamics are complex and not fully known, and only a limited amount of data is available. Our approach jointly calibrates the digital twin and searches for an optimal control policy, thus accounting for and reducing model error. We balance exploration and exploitation by using policy performance as a guide for data collection. This dual-component approach provably converges to the optimal policy, and outperforms existing methods in extensive numerical experiments based on the biopharmaceutical manufacturing domain.
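To make the calibrate-then-control loop described above concrete, here is a minimal toy sketch in Python. It is not the paper's algorithm: the scalar dynamics `s' = theta * tanh(s) + a + noise`, the conjugate Gaussian belief over `theta`, the grid of feedback gains, and every function name are assumptions chosen only for illustration. Each round, the policy that looks best in the simulator (averaged over posterior draws) is run on the "real" process, and the resulting transitions are used to recalibrate the simulator.

```python
import math
import random

TRUE_THETA = 0.8   # hidden parameter of the "real" process (toy assumption)
NOISE_SD = 0.1     # known process-noise standard deviation (toy assumption)
HORIZON = 10       # steps per episode
GAINS = [0.0, 0.25, 0.5, 0.75, 1.0]  # hypothetical candidate feedback gains


def step(theta, s, a, rng):
    """One stochastic nonlinear transition: s' = theta * tanh(s) + a + noise."""
    return theta * math.tanh(s) + a + rng.gauss(0.0, NOISE_SD)


def rollout_cost(theta, gain, rng, s0=2.0, transitions=None):
    """Run the policy a = -gain * tanh(s) and return the accumulated cost."""
    s, cost = s0, 0.0
    for _ in range(HORIZON):
        a = -gain * math.tanh(s)
        s_next = step(theta, s, a, rng)
        if transitions is not None:
            transitions.append((s, a, s_next))
        cost += s_next ** 2 + 0.1 * a ** 2
        s = s_next
    return cost


def calibrate(mean, var, transitions):
    """Conjugate Gaussian update of the belief over theta (noise variance known)."""
    precision = 1.0 / var
    weighted = mean * precision
    for s, a, s_next in transitions:
        x = math.tanh(s)   # regressor
        y = s_next - a     # response, since y = theta * x + noise
        precision += x * x / NOISE_SD ** 2
        weighted += x * y / NOISE_SD ** 2
    return weighted / precision, 1.0 / precision


def best_gain(mean, var, rng, n_samples=30):
    """Pick the gain with the lowest average simulated cost over posterior draws."""
    def expected_cost(gain):
        draws = [rng.gauss(mean, math.sqrt(var)) for _ in range(n_samples)]
        return sum(rollout_cost(th, gain, rng) for th in draws) / n_samples
    return min(GAINS, key=expected_cost)


def actor_simulator_toy(rounds=5, seed=0):
    """Alternate policy search in the simulator with calibration on real data."""
    rng = random.Random(seed)
    mean, var = 0.0, 1.0  # prior belief over theta
    history = []
    for _ in range(rounds):
        gain = best_gain(mean, var, rng)                         # policy search
        batch = []
        rollout_cost(TRUE_THETA, gain, rng, transitions=batch)   # real data
        mean, var = calibrate(mean, var, batch)                  # recalibrate
        history.append((gain, mean, var))
    return history
```

Calling `actor_simulator_toy()` returns one `(gain, posterior_mean, posterior_variance)` tuple per round. In this toy setting the posterior variance shrinks every round because each episode adds informative transitions; the paper's framework addresses far richer dynamics and a principled exploration-exploitation rule that this sketch does not attempt to reproduce.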