KISS: Keeping it Simple and Slotted when Learning to Communicate over Wireless

📅 2026-05-29

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study addresses the challenge of achieving efficient and fair random channel access in distributed wireless systems without relying on fixed heuristics. The authors train independent agents over a slotted channel using an off-policy Double Deep Q-Network (DDQN) combined with Bayesian inference. The method operates under minimal assumptions: no pretraining, no coordination, and no explicit inter-agent communication, so each agent learns its access policy fully online and adapts it to varying network conditions. In simulation, the learned policies maintain fairness among users while achieving near-theoretical channel efficiency. Ablation studies show that the emergent behavior resembles slotted ALOHA with a dynamically adjusted transmission probability, which gives the method its name, KISS. The stated aim is to test whether decentralized learning can rediscover or approximate theoretically efficient random-access mechanisms, not to propose a deployable protocol.
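For readers unfamiliar with Double DQN, the following is a minimal sketch of the standard Double DQN bootstrap target, in which the online network selects the next action and the target network evaluates it. It is not the authors' implementation and omits the Bayesian-inference component entirely; the function name, the list-based Q-value representation, and the discount factor are assumptions made for illustration.

```python
from typing import List, Sequence


def double_dqn_targets(
    rewards: Sequence[float],
    dones: Sequence[bool],
    q_online_next: Sequence[Sequence[float]],
    q_target_next: Sequence[Sequence[float]],
    gamma: float = 0.99,  # assumed discount factor, not taken from the paper
) -> List[float]:
    """Standard Double DQN targets for a batch of transitions.

    q_online_next[i][a]: online-network Q-value of action a in next state i.
    q_target_next[i][a]: target-network Q-value of action a in next state i.
    """
    targets = []
    for r, done, q_on, q_tg in zip(rewards, dones, q_online_next, q_target_next):
        if done:
            # Terminal transition: no bootstrapping.
            targets.append(r)
            continue
        # Action selection uses the online network ...
        best_action = max(range(len(q_on)), key=lambda a: q_on[a])
        # ... while its value is read from the target network.
        targets.append(r + gamma * q_tg[best_action])
    return targets


if __name__ == "__main__":
    # Two hypothetical transitions with two actions each (e.g. wait / transmit).
    print(double_dqn_targets(
        rewards=[1.0, 0.0],
        dones=[False, True],
        q_online_next=[[0.2, 0.8], [0.5, 0.1]],
        q_target_next=[[1.0, 2.0], [3.0, 4.0]],
        gamma=0.5,
    ))
```

Separating action selection from action evaluation is what distinguishes Double DQN from plain DQN and is intended to reduce overestimation of Q-values.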

📝 Abstract

A long-standing challenge in distributed wireless systems is ensuring efficient and fair random channel access. Existing solutions often address specific constraints related to timing, periodicity, or centralization, but they typically rely on fixed heuristics. Motivated by recent advances in machine learning (ML), we investigate whether ML agents can autonomously learn efficient and fair access strategies, and whether such learning can offer new insights into medium access control (MAC) design. Rather than proposing a deployable protocol, our aim is to examine whether decentralized learning can rediscover or approximate theoretically efficient random-access mechanisms under minimal assumptions. To this end, we deploy an off-policy Double Deep Q-Network (DDQN) with Bayesian inference to train agents operating over a slotted channel. The resulting method is fully online (no pre-training), fully distributed (independent multi-agent learners), stochastic (non-periodic), and requires no coordination or explicit communication. Extensive simulations show that the learned strategy adapts to varying network conditions and achieves near-theoretical efficiency while maintaining fairness. Ablation studies further reveal that the learned behavior resembles slotted ALOHA with a dynamically adjusted transmission probability, leading us to refer to the method as KISS: Keeping It Simple and Slotted.
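The slotted-ALOHA behavior that the abstract refers to can be made concrete with a small simulation. The sketch below is not the authors' code and involves no learning: it assumes N agents that each transmit independently with a fixed probability p in every slot, and it counts a slot as successful when exactly one agent transmits. It compares the simulated throughput with the textbook expression N·p·(1−p)^(N−1), which is maximized at p = 1/N and tends to 1/e ≈ 0.368 as N grows. Jain's index is used here as one common fairness measure; the fairness metric used in the paper is not stated in the abstract. All function names are illustrative.

```python
import random
from typing import List, Sequence


def aloha_throughput(n_agents: int, p: float) -> float:
    """Probability that exactly one of n_agents transmits in a slot."""
    return n_agents * p * (1.0 - p) ** (n_agents - 1)


def simulate_slotted_aloha(n_agents: int, p: float, n_slots: int, seed: int = 0) -> List[int]:
    """Return the number of successful slots per agent."""
    rng = random.Random(seed)
    successes = [0] * n_agents
    for _ in range(n_slots):
        # Each agent decides independently whether to transmit in this slot.
        senders = [i for i in range(n_agents) if rng.random() < p]
        # A slot succeeds only when there is no collision and it is not idle.
        if len(senders) == 1:
            successes[senders[0]] += 1
    return successes


def jain_index(shares: Sequence[float]) -> float:
    """Jain's fairness index: 1.0 for equal shares, 1/n when one agent takes all."""
    total = sum(shares)
    squares = sum(v * v for v in shares)
    return (total * total) / (len(shares) * squares) if squares else 0.0


if __name__ == "__main__":
    n, slots = 10, 20_000
    p = 1.0 / n  # throughput-maximizing transmission probability for n agents
    per_agent = simulate_slotted_aloha(n, p, slots)
    print("analytical throughput:", aloha_throughput(n, p))
    print("simulated throughput: ", sum(per_agent) / slots)
    print("Jain fairness index:  ", jain_index(per_agent))
```

Because the best p depends on the number of contending agents, a fixed value cannot be optimal across network sizes, which is why a dynamically adjusted transmission probability matters.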

Problem

Research questions and friction points this paper is trying to address.

random channel access

medium access control

distributed wireless systems

fairness

efficiency

Innovation

Methods, ideas, or system contributions that make the work stand out.

multi-agent reinforcement learning

slotted ALOHA

decentralized MAC