🤖 AI Summary
Problem: Traditional frequency-hopping anti-jamming methods are not robust against dynamic, unknown interference, while existing deep reinforcement learning (DRL) approaches converge slowly during training. Method: This paper proposes a fast adaptive channel access method built on the intuition of “learning faster than the jammer.” It adds coarse-grained spectrum prediction as an auxiliary task within a Deep Q-Network (DQN) framework, so that the prediction task and the Q-function are trained together and the Q-function learns interference evolution patterns more efficiently. Contribution/Results: The method improves policy responsiveness and training efficiency, reducing the required training episodes by up to 70% compared to standard DRL and increasing throughput by 10% over Nash equilibrium strategies. By jointly training spectrum prediction and DRL, it targets low-latency, highly adaptive channel access in dynamic adversarial wireless environments.
📝 Abstract
This paper investigates the anti-jamming channel access problem in complex and unknown jamming environments, where the jammer can dynamically adjust its strategy to target different channels. Traditional channel-hopping anti-jamming approaches that use fixed patterns are ineffective against such dynamic jamming attacks. Although the emerging deep reinforcement learning (DRL) based dynamic channel access approach can achieve the Nash equilibrium (NE) under fast-changing jamming attacks, it requires extensive training episodes. To address this issue, we propose a fast adaptive anti-jamming channel access approach guided by the intuition of "learning faster than the jammer", where a synchronously updated coarse-grained spectrum prediction serves as an auxiliary task for the deep Q-network (DQN) based anti-jamming model. This helps the model identify a superior Q-function compared to standard DRL while significantly reducing the number of training episodes. Numerical results indicate that the proposed approach significantly accelerates convergence in model training, reducing the required training episodes by up to 70% compared to standard DRL. It also achieves a 10% improvement in throughput over NE strategies, owing to the effective use of coarse-grained spectrum prediction.
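To make the auxiliary-task idea concrete, the following is a minimal sketch of how a joint training objective could be assembled. The abstract does not give the loss function, so everything here is an assumption for illustration: the squared one-step TD error, the binary cross-entropy prediction loss, the grouping of adjacent channels into coarse bands, and the names `td_loss`, `coarse_labels`, `prediction_loss`, `joint_loss`, and `aux_weight` are all hypothetical and not taken from the paper.

```python
import math


def td_loss(q_values, action, reward, next_q_values, gamma=0.9):
    """Squared one-step TD error for the channel that was accessed (assumed form)."""
    target = reward + gamma * max(next_q_values)
    return (target - q_values[action]) ** 2


def coarse_labels(jammed, group_size):
    """Collapse per-channel jamming flags into per-band flags.

    A band is labeled 1 if any channel inside it is jammed. This is one
    possible reading of "coarse-grained"; the paper may define it differently.
    """
    return [
        int(any(jammed[i:i + group_size]))
        for i in range(0, len(jammed), group_size)
    ]


def prediction_loss(probs, labels, eps=1e-12):
    """Mean binary cross-entropy between predicted band occupancy and labels."""
    total = 0.0
    for p, y in zip(probs, labels):
        total += y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
    return -total / len(labels)


def joint_loss(q_values, action, reward, next_q_values,
               band_probs, band_labels, gamma=0.9, aux_weight=0.5):
    """Main TD loss plus a weighted auxiliary spectrum-prediction loss.

    Minimizing both terms in the same update step is what "synchronously
    updated" is taken to mean here; aux_weight is a hypothetical hyperparameter.
    """
    main = td_loss(q_values, action, reward, next_q_values, gamma)
    aux = prediction_loss(band_probs, band_labels)
    return main + aux_weight * aux


if __name__ == "__main__":
    # Six channels, channel 1 jammed, grouped into three bands of two channels.
    labels = coarse_labels([0, 1, 0, 0, 0, 0], group_size=2)
    loss = joint_loss(
        q_values=[0.0, 1.0], action=1, reward=1.0, next_q_values=[0.5, 2.0],
        band_probs=[0.9, 0.1, 0.1], band_labels=labels, gamma=0.5,
    )
    print(labels, round(loss, 4))
```

In a real DQN both terms would be computed from the outputs of a shared network and backpropagated together, so gradients from the prediction head also shape the features used by the Q-head. The sketch only shows how the two scalar losses combine.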