🤖 AI Summary
This work addresses the high computational complexity of delay–Doppler domain processing in Orthogonal Time Frequency Space (OTFS) modulation, which hinders real-time implementation despite the waveform's robustness in high-mobility scenarios. The authors propose a hardware–algorithm co-designed, GPU-accelerated Zak-OTFS receiver that exploits the structured sparsity of delay–Doppler channels. By integrating compact matrix operations, branchless iterative equalization, and a compute-aware architecture, the design substantially reduces both computational and memory overheads. The implementation achieves a throughput of 906.52 Mbps on a 16384×32 grid with 16QAM and 245.76 MHz bandwidth, meeting the real-time processing deadline at the 99.9th percentile of latency. Evaluations on an Intel Xeon CPU and several NVIDIA GPUs (Jetson Orin, RTX 6000 Ada, A100, and H200) demonstrate scalability and robustness.
📝 Abstract
Orthogonal time frequency space (OTFS) modulation offers superior robustness to high-mobility channels compared to conventional orthogonal frequency-division multiplexing (OFDM) waveforms. However, its explicit delay-Doppler (DD) domain representation incurs substantial signal processing complexity, especially as the DD-domain grid size increases. To address this challenge, we present a scalable, real-time Zak-OTFS receiver architecture on GPUs through hardware-algorithm co-design that exploits DD-domain channel sparsity. Our design leverages compact matrix operations for key processing stages, a branchless iterative equalizer, and a structured sparse representation of the DD-domain channel matrix to significantly reduce computational and memory overhead. These optimizations enable low-latency processing that consistently meets the real-time processing deadline at the 99.9th percentile of latency. The proposed system achieves up to 906.52 Mbps throughput with a DD grid size of (16384, 32) using 16QAM modulation over 245.76 MHz bandwidth. Extensive evaluations under a Vehicular-A channel model demonstrate strong scalability and robust performance across CPU (Intel Xeon) and multiple GPU platforms (NVIDIA Jetson Orin, RTX 6000 Ada, A100, and H200), highlighting the effectiveness of compute-aware Zak-OTFS receiver design for next-generation (NextG) high-mobility communication systems.
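To put the reported operating point in context, the following is a minimal back-of-the-envelope sketch of the frame budget implied by the stated grid, modulation, and bandwidth. It assumes a critically sampled Zak-OTFS frame, in which the M×N DD-domain symbols occupy M·N/B seconds, and it assumes every grid point carries data; neither assumption is stated in the abstract. The function name `zak_otfs_frame_budget` and its parameter names are hypothetical.

```python
def zak_otfs_frame_budget(num_delay_bins, num_doppler_bins,
                          bandwidth_hz, bits_per_symbol):
    """Return (frame duration in s, bits per frame, raw bit rate in bit/s).

    Assumption (not from the abstract): the frame is critically sampled,
    so its duration is M * N / B, and all M * N grid points carry data.
    """
    symbols_per_frame = num_delay_bins * num_doppler_bins
    frame_duration_s = symbols_per_frame / bandwidth_hz
    bits_per_frame = symbols_per_frame * bits_per_symbol
    raw_bit_rate = bits_per_frame / frame_duration_s
    return frame_duration_s, bits_per_frame, raw_bit_rate


# Operating point quoted in the abstract: (16384, 32) grid, 16QAM, 245.76 MHz.
frame_s, frame_bits, raw_rate = zak_otfs_frame_budget(
    num_delay_bins=16384,
    num_doppler_bins=32,
    bandwidth_hz=245.76e6,
    bits_per_symbol=4,  # 16QAM carries log2(16) = 4 bits per symbol
)

print(f"frame duration : {frame_s * 1e3:.3f} ms")
print(f"bits per frame : {frame_bits}")
print(f"raw bit rate   : {raw_rate / 1e6:.2f} Mbps")
print(f"reported/raw   : {906.52e6 / raw_rate:.3f}")
```

Under these assumptions, one frame lasts roughly 2.13 ms and the raw rate is 4 bits × 245.76 MHz = 983.04 Mbps, so the reported 906.52 Mbps is about 92% of that bound. The abstract does not say how its throughput figure is computed, so the source of the difference is left open here.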