🤖 AI Summary
To address low synchronization accuracy and resource constraints in dynamic wireless factory networks, this paper formulates the digital twin synchronization process as a Constrained Markov Decision Process (CMDP) for the first time. We propose a joint device selection and radio resource block (RB) scheduling framework based on Continual Reinforcement Learning (CRL), which reuses experience across time and adapts rapidly to environmental changes. Hard constraints, particularly real-time feasibility, are handled via a Lagrangian dual transformation. Experimental results demonstrate that, with the same number of available RBs, the proposed method reduces the normalized root mean square error (NRMSE) by up to 55.2%, improving both synchronization fidelity and network adaptability to dynamic conditions. This work establishes a novel paradigm for high-fidelity digital twins in resource-constrained industrial environments.
📝 Abstract
This article investigates an adaptive resource allocation scheme for digital twin (DT) synchronization optimization over dynamic wireless networks. In the considered model, a base station (BS) continuously collects state data of physical factory objects from wireless devices to build a real-time virtual DT system for factory event analysis. Because data are transmitted continuously, maintaining DT synchronization consumes extensive wireless resources. To address this issue, a subset of devices is selected to transmit their sensing data, and resource block (RB) allocation is optimized. This problem is formulated as a constrained Markov decision process (CMDP) that minimizes the long-term mismatch between the physical and virtual systems. To solve this CMDP, we first transform the problem into a dual problem that refines the impact of the RB constraints on device scheduling strategies. We then propose a continual reinforcement learning (CRL) algorithm to solve the dual problem. The CRL algorithm learns a stable policy across historical experiences, allowing quick adaptation to dynamics in physical states and network capacity. Simulation results show that the CRL algorithm adapts quickly to network capacity changes and reduces the normalized root mean square error (NRMSE) between physical and virtual states by up to 55.2% while using the same number of RBs as traditional methods.
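To make the quantities in the abstract concrete, the following is a minimal Python sketch of the NRMSE metric and of how a dual (Lagrangian) formulation can fold an RB budget into a per-step reward. It is an illustration under stated assumptions, not the paper's implementation: normalizing the RMSE by the range of the physical values, the linear penalty on RB usage, the projected dual-ascent update, and all function and parameter names (`nrmse`, `lagrangian_reward`, `update_multiplier`, `rb_budget`, `lam`, `step_size`) are hypothetical choices made here.

```python
import math


def nrmse(physical, virtual):
    """RMSE between physical and virtual states, divided by the range of
    the physical values. Range normalization is an assumption; the paper
    may normalize differently."""
    if not physical or len(physical) != len(virtual):
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((p - v) ** 2 for p, v in zip(physical, virtual)) / len(physical)
    span = max(physical) - min(physical)
    if span == 0:
        raise ValueError("physical values have zero range")
    return math.sqrt(mse) / span


def lagrangian_reward(mismatch, rbs_used, rb_budget, lam):
    """Per-step reward of the assumed dual problem: the negative mismatch
    minus a penalty proportional to the RB budget violation."""
    return -mismatch - lam * (rbs_used - rb_budget)


def update_multiplier(lam, rbs_used, rb_budget, step_size):
    """Projected dual ascent: raise the multiplier when the budget is
    exceeded, lower it otherwise, and never let it go below zero."""
    return max(0.0, lam + step_size * (rbs_used - rb_budget))
```

In such a scheme, a learning agent would maximize `lagrangian_reward` over device selections while `update_multiplier` is called periodically, so the price of an RB rises whenever the schedule exceeds the budget and falls back toward zero when capacity is left unused.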