🤖 AI Summary
This work addresses the challenge of identifying, with high confidence and sample efficiency, the optimal edge node that minimizes the long-term average Age of Information (AoI) under unknown and time-varying node dynamics. The problem is formulated as a fixed-confidence best-arm identification task in a restless multi-armed bandit (RMAB) setting, where each arm evolves independently according to an unknown Markov chain. The paper presents the first RMAB identification framework tailored for AoI minimization, featuring an age-aware LUCB algorithm. Key contributions include establishing an instance-dependent upper bound on sample complexity that scales with the Markov mixing time, and deriving an information-theoretic lower bound that quantifies how temporal correlations affect identification difficulty. Experimental results demonstrate that the proposed method substantially reduces sampling costs and outperforms existing baselines, particularly in high-confidence regimes.
📝 Abstract
Real-time status updating applications increasingly rely on networks of devices and edge nodes to maintain data freshness, as quantified by the age of information (AoI) metric. Given that edge computing nodes exhibit uncertain and time-varying dynamics, it is essential to identify the optimal edge node with high confidence and sample efficiency, even without prior knowledge of these dynamics, to ensure timely updates. To address this challenge, we introduce the first best arm identification (BAI) problem aimed at minimizing the long-term average AoI under a fixed confidence setting, framed within the context of a restless multi-armed bandit (RMAB) model. In this model, each arm evolves independently according to an unknown Markov chain over time, regardless of whether it is selected. To capture the temporal trajectories of AoI in the presence of unknown restless dynamics, we develop an age-aware LUCB algorithm that incorporates Markovian sampling. Additionally, we establish an instance-dependent upper bound on the sample complexity, which captures the difficulty of the problem as a function of the underlying Markov mixing behavior. Moreover, we derive an information-theoretic lower bound to characterize the fundamental challenges of the problem. We show that the sample complexity is influenced by the temporal correlation of the Markov dynamics, aligning with the intuition offered by the upper bound. Our numerical results show that, compared to existing benchmarks, the proposed scheme significantly reduces sampling costs, particularly under more stringent confidence levels.
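To make the objective concrete, the following is a minimal sketch of how the long-term average AoI of a single restless node could be estimated by simulation. It is not the paper's age-aware LUCB algorithm. The two-state (good/bad) Markov chain, the rule that an update is delivered whenever the node is in the good state, and the names `average_aoi`, `p_gb`, and `p_bg` are all illustrative assumptions that do not appear in the abstract.

```python
import random


def average_aoi(p_gb, p_bg, horizon, seed=0):
    """Empirical average AoI of one node over `horizon` slots.

    Assumed toy model (hypothetical, not taken from the paper):
    - the node follows a two-state Markov chain (good / bad),
    - p_gb is the probability of moving good -> bad in one slot,
    - p_bg is the probability of moving bad -> good in one slot,
    - a fresh update arrives in every slot where the node is good,
      resetting the age to 1; otherwise the age grows by 1.
    """
    rng = random.Random(seed)
    good = True   # assumed initial state
    age = 1       # assumed initial age
    total = 0
    for _ in range(horizon):
        # Restless dynamics: the state evolves in every slot,
        # whether or not the node is being sampled.
        if good:
            good = rng.random() >= p_gb
        else:
            good = rng.random() < p_bg
        age = 1 if good else age + 1
        total += age
    return total / horizon


# Hypothetical nodes as (p_gb, p_bg) pairs; the "best arm" is the one
# with the smallest long-term average AoI.
nodes = [(0.1, 0.5), (0.3, 0.3), (0.05, 0.2)]
estimates = [average_aoi(p_gb, p_bg, horizon=50_000, seed=i)
             for i, (p_gb, p_bg) in enumerate(nodes)]
best = min(range(len(nodes)), key=lambda i: estimates[i])
```

In this sketch every node is simulated for the full horizon, which an identification algorithm cannot afford. The fixed-confidence setting described above asks instead for a sampling and stopping rule that returns the best node with probability at least 1 - δ while using as few samples as possible.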