RF-GPT: Teaching AI to See the Wireless World

📅 2026-02-16
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the gap between wireless signal perception and high-level semantic reasoning by proposing the first radio frequency (RF) language model that natively integrates RF signals into a multimodal large language model framework. By converting in-phase/quadrature (IQ) waveforms into time-frequency spectrograms, the approach leverages a pretrained vision encoder to extract meaningful features, which are then injected into the language decoder as RF tokens, enabling end-to-end trainable RF semantic understanding without requiring manual annotations. The model demonstrates superior performance over general-purpose vision-language models across multiple tasks—including wideband modulation classification, wireless technology identification, and WLAN user counting—thereby validating its effectiveness and strong generalization capability in RF-aware multimodal reasoning.
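The summary's core preprocessing step, mapping complex IQ waveforms to time-frequency spectrograms, can be sketched with a short-time Fourier transform. The snippet below is a minimal illustration, not the paper's actual pipeline: the sample rate, window length, and toy two-tone signal are all assumptions chosen for clarity.

```python
import numpy as np
from scipy.signal import stft

# Synthesize a toy complex IQ waveform: two tones at +100 kHz and -250 kHz
# plus light complex noise (illustrative only, not a standards waveform).
fs = 1_000_000  # assumed sample rate in Hz
t = np.arange(100_000) / fs
iq = np.exp(2j * np.pi * 100_000 * t) + 0.5 * np.exp(2j * np.pi * -250_000 * t)
iq += 0.05 * (np.random.randn(t.size) + 1j * np.random.randn(t.size))

# STFT of the complex baseband signal; a two-sided spectrum keeps negative
# frequencies, which carry real information for IQ data.
f, tau, Z = stft(iq, fs=fs, nperseg=256, noverlap=192, return_onesided=False)

# Log-magnitude spectrogram, shifted so frequency runs from -fs/2 to +fs/2,
# ready to be rendered as an image for a pretrained vision encoder.
spec_db = 20 * np.log10(np.abs(np.fft.fftshift(Z, axes=0)) + 1e-12)
print(spec_db.shape)  # (freq_bins, time_frames)
```

In practice the dB array would be color-mapped and resized to the vision encoder's expected input resolution; those rendering details are not specified here.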

📝 Abstract
Large language models (LLMs) and multimodal models have become powerful general-purpose reasoning systems. However, radio-frequency (RF) signals, which underpin wireless systems, are still not natively supported by these models. Existing LLM-based approaches for telecom focus mainly on text and structured data, while conventional RF deep-learning models are built separately for specific signal-processing tasks, highlighting a clear gap between RF perception and high-level reasoning. To bridge this gap, we introduce RF-GPT, a radio-frequency language model (RFLM) that utilizes the visual encoders of multimodal LLMs to process and understand RF spectrograms. In this framework, complex in-phase/quadrature (IQ) waveforms are mapped to time-frequency spectrograms and then passed to pretrained visual encoders. The resulting representations are injected as RF tokens into a decoder-only LLM, which generates RF-grounded answers, explanations, and structured outputs. To train RF-GPT, we perform supervised instruction fine-tuning of a pretrained multimodal LLM using a fully synthetic RF corpus. Standards-compliant waveform generators produce wideband scenes for six wireless technologies, from which we derive time-frequency spectrograms, exact configuration metadata, and dense captions. A text-only LLM then converts these captions into RF-grounded instruction-answer pairs, yielding roughly 12,000 RF scenes and 0.625 million instruction examples without any manual labeling. Across benchmarks for wideband modulation classification, overlap analysis, wireless-technology recognition, WLAN user counting, and 5G NR information extraction, RF-GPT achieves strong multi-task performance, whereas general-purpose VLMs with no RF grounding largely fail.
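The abstract's description of injecting encoder representations "as RF tokens into a decoder-only LLM" amounts to projecting patch features into the LLM's embedding space and prepending them to the prompt embeddings. The sketch below shows that wiring with plain numpy; every dimension (196 patches, 768-dim encoder, 1024-dim LLM) and the random projector are assumptions for illustration, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed dimensions: a ViT-style encoder emitting 196 patch features of
# size 768, projected into a 1024-dim LLM embedding space.
n_patches, d_vis, d_llm = 196, 768, 1024

vis_feats = rng.standard_normal((n_patches, d_vis))  # frozen encoder output
W_proj = rng.standard_normal((d_vis, d_llm)) * 0.02  # trainable projector

rf_tokens = vis_feats @ W_proj                       # "RF tokens"

# Prepend the RF tokens to the text-prompt embeddings so the decoder-only
# LLM attends to the spectrogram content while generating its answer.
prompt_emb = rng.standard_normal((12, d_llm))        # e.g. a tokenized question
decoder_input = np.concatenate([rf_tokens, prompt_emb], axis=0)
print(decoder_input.shape)  # (208, 1024)
```

During instruction fine-tuning, only the projector (and optionally the LLM) would receive gradients; the frozen-encoder choice above is one common design, not a claim about RF-GPT's training recipe.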
Problem

Research questions and friction points this paper is trying to address.

radio-frequency signals
large language models
multimodal models
RF perception
wireless systems
Innovation

Methods, ideas, or system contributions that make the work stand out.

RF-GPT
radio-frequency language model
multimodal LLM
spectrogram encoding
synthetic RF dataset
Hang Zou
Research Institute for Digital Future, Khalifa University, 127788 Abu Dhabi, UAE
Yu Tian
Research Institute for Digital Future, Khalifa University, 127788 Abu Dhabi, UAE
Bohao Wang
College of Information Science & Electronic Engineering, Zhejiang University
Wireless AI, Communication, 6G, Digital Twin, Ray Tracing
Lina Bariah
Lead AI Scientist | Adjunct Professor at Khalifa University
6G Networks, Artificial Intelligence, Machine Learning, Large Telecom Models, Generative AI
Samson Lasaulce
CNRS Director of Research
Game theory, Optimization, Learning, Networks
Chongwen Huang
College of Information Science and Electronic Engineering, Zhejiang University, 310027, Hangzhou, China
Mérouane Debbah
Research Institute for Digital Future, Khalifa University, 127788 Abu Dhabi, UAE