Token Communications: A Unified Framework for Cross-modal Context-aware Semantic Communications

📅 2025-02-17
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper addresses the limited use of cross-modal context in generative semantic communication (GenSC) by proposing Token Communications (TokCom), a framework in which tokens are the communication units. It brings generative foundation models (GFMs) and multimodal large language models (MLLMs) into semantic communication to model context across modalities at the token level. The framework uses a Transformer-based architecture that supports token-wise encoding/decoding, cross-modal alignment, generative semantic compression and reconstruction, and semantic-driven wireless resource adaptation. In an image GenSC setup, TokCom increases bandwidth efficiency by 70.8% with negligible loss of semantic and perceptual quality, which supports the practicality of cross-modal context modeling in GenSC.

📝 Abstract
In this paper, we introduce token communications (TokCom), a unified framework to leverage cross-modal context information in generative semantic communications (GenSC). TokCom is a new paradigm, motivated by the recent success of generative foundation models and multimodal large language models (GFM/MLLMs), where the communication units are tokens, enabling efficient transformer-based token processing at the transmitter and receiver. We outline the potential opportunities and challenges of leveraging context in GenSC, explore how to integrate GFM/MLLM-based token processing into semantic communication systems to leverage cross-modal context effectively, and present the key principles for efficient TokCom at various layers in future wireless networks. We demonstrate the corresponding TokCom benefits in a GenSC setup for images, where leveraging cross-modal context information increases the bandwidth efficiency by 70.8% with negligible loss of semantic/perceptual quality. Finally, potential research directions are identified to facilitate the adoption of TokCom in future wireless networks.
Problem

Research questions and friction points this paper is trying to address.

Cross-modal context-aware semantic communications
Efficient transformer-based token processing
Bandwidth efficiency in generative semantic communications
Innovation

Methods, ideas, or system contributions that make the work stand out.

Token-based cross-modal communication
Transformer processing for semantic tokens
Multimodal context enhances bandwidth efficiency
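
As a rough illustration of why context can save bandwidth when the communication units are tokens, the sketch below counts payload bits for a toy token sequence. It is not the paper's method: the codebook size, the `predictable` flags (standing in for a receiver-side model that can infer some tokens from context), the one-bit-per-position mask, and all function names are assumptions made for this example, and the numbers are not meant to reproduce the reported 70.8% figure.

```python
def bits_per_token(codebook_size: int) -> int:
    """Bits needed to index one token in a codebook of the given size."""
    return max(1, (codebook_size - 1).bit_length())


def select_tokens_to_send(tokens, predictable):
    """Keep (position, token) pairs the receiver is assumed unable to infer."""
    return [(i, t) for i, (t, p) in enumerate(zip(tokens, predictable)) if not p]


def payload_bits(num_sent: int, num_total: int, codebook_size: int) -> int:
    """Token indices plus a one-bit-per-position mask marking which were sent."""
    return num_sent * bits_per_token(codebook_size) + num_total


# Hypothetical image tokens drawn from an assumed 1024-entry codebook.
tokens = [17, 905, 905, 42, 3, 3, 3, 511]
# Assumption: True marks tokens the receiver could regenerate from context.
predictable = [False, False, True, False, False, True, True, False]

sent = select_tokens_to_send(tokens, predictable)
full_bits = len(tokens) * bits_per_token(1024)           # 8 * 10 = 80
reduced_bits = payload_bits(len(sent), len(tokens), 1024)  # 5 * 10 + 8 = 58
saving = 1 - reduced_bits / full_bits                    # 0.275

print(f"sent {len(sent)}/{len(tokens)} tokens, "
      f"{reduced_bits} vs {full_bits} bits, saving {saving:.1%}")
```

In this toy setting the saving depends entirely on how many tokens the receiver can recover from context, which is the intuition behind using cross-modal context to reduce what must be transmitted.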
Authors

Li Qiao
Beijing Institute of Technology
Wireless Communications, Signal Processing, Machine Learning
Mahdi Boloursaz Mashhadi
Lecturer (Assistant Professor) at University of Surrey
Wireless Communications, Signal Processing, Machine Learning
Zhen Gao
Beijing Institute of Technology
Generative AI, 6G, MIMO communications, IoT edge computing, Large Model
Rahim Tafazolli
5GIC & 6GIC, Institute for Communication Systems (ICS), University of Surrey, Guildford, United Kingdom
Mehdi Bennis
Centre for Wireless Communications, University of Oulu, 90014 Oulu, Finland
Dusit Niyato
School of Computer Science and Engineering, Nanyang Technological University, Singapore 639798