🤖 AI Summary
This work addresses the limited generalization of existing data-driven approaches in dynamic wireless environments, which stems from their failure to model the underlying physics of electromagnetic wave propagation. To overcome this, we propose the Wireless World Model (WWM), a multimodal foundation framework that introduces, for the first time, a physics-aware world model into 6G research. By integrating channel state information, 3D point clouds, and user trajectories through a joint embedding predictive architecture and a multimodal mixture-of-experts Transformer, WWM explicitly captures the causal relationships between 3D geometry and signal dynamics. This enables accurate spatiotemporal prediction of wireless channel evolution. Evaluated on five downstream tasks, WWM consistently outperforms both state-of-the-art single-modality foundation models and specialized architectures. The results, which include validation on real measurement data, indicate that the model internalizes physical principles and generalizes across diverse scenarios.
📝 Abstract
Integrating AI into the physical layer is a cornerstone of 6G networks. However, current data-driven approaches struggle to generalize across dynamic environments because they lack an intrinsic understanding of electromagnetic wave propagation. We introduce the Wireless World Model (WWM), a multi-modal foundation framework that predicts the spatiotemporal evolution of wireless channels by internalizing the causal relationship between 3D geometry and signal dynamics. Pre-trained on a massive ray-traced multi-modal dataset, WWM overcomes the data authenticity gap, as further validated on real-world measurement data. Using a joint-embedding predictive architecture with a multi-modal mixture-of-experts Transformer, WWM fuses channel state information, 3D point clouds, and user trajectories into a unified representation. Across its five key downstream tasks, WWM achieves strong performance in seen environments, unseen generalization scenarios, and real-world measurements, consistently outperforming state-of-the-art (SOTA) uni-modal foundation models and task-specific models. This paves the way for physics-aware 6G intelligence that adapts to the physical world.
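
To make the architectural idea concrete, the following is a minimal toy sketch of a joint-embedding predictive objective combined with top-1 mixture-of-experts routing. It is not the WWM implementation: the dimensions, the random weights, the mean pooling, the shared context/target encoder, and every function name (`moe_layer`, `encode`, `jepa_loss`) are assumptions made for illustration, and the input tokens are random stand-ins for channel state information, point-cloud, and trajectory features.

```python
import math
import random

def linear(weights, x):
    """Apply a bias-free dense layer; `weights` is a list of rows."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def random_matrix(rng, rows, cols):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def moe_layer(token, gate, experts):
    """Top-1 mixture-of-experts: send the token to its highest-scoring expert."""
    probs = softmax(linear(gate, token))
    best = max(range(len(experts)), key=lambda i: probs[i])
    out = linear(experts[best], token)
    # Scale by the gate probability, as is common in sparse MoE layers.
    return [probs[best] * v for v in out], best

def encode(tokens, gate, experts):
    """Mean-pool the expert outputs of all tokens into one embedding."""
    outs = [moe_layer(t, gate, experts)[0] for t in tokens]
    return [sum(col) / len(outs) for col in zip(*outs)]

def jepa_loss(context_tokens, target_tokens, gate, experts, predictor):
    """Mean squared error between predicted and actual target embeddings.

    The loss lives in latent space, not in raw signal space. For brevity the
    target reuses the context encoder (an assumption of this sketch);
    JEPA-style training typically uses a separate, slowly updated target
    encoder with gradients stopped.
    """
    predicted = linear(predictor, encode(context_tokens, gate, experts))
    target = encode(target_tokens, gate, experts)
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)

rng = random.Random(0)
DIM, NUM_EXPERTS = 4, 3  # hypothetical toy sizes
gate = random_matrix(rng, NUM_EXPERTS, DIM)
experts = [random_matrix(rng, DIM, DIM) for _ in range(NUM_EXPERTS)]
predictor = random_matrix(rng, DIM, DIM)

# Random stand-ins for one CSI token, one point-cloud token, one trajectory token.
context = [[rng.gauss(0, 1) for _ in range(DIM)] for _ in range(3)]
# Random stand-in for the future channel token to be predicted.
future_csi = [[rng.gauss(0, 1) for _ in range(DIM)]]

loss = jepa_loss(context, future_csi, gate, experts, predictor)
print(f"latent prediction loss: {loss:.4f}")
```

The point of the sketch is the shape of the objective: the predictor is scored on how well it matches the embedding of the future observation, so the encoder is pushed toward features that make channel evolution predictable rather than toward reconstructing every raw sample. Routing each token to a single expert shows one way a shared Transformer could specialize across modalities without running all parameters on every token.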