Token Communications: A Unified Framework for Cross-modal Context-aware Semantic Communications

📅 2025-02-17
📈 Citations: 0
Influential: 0

🤖 AI Summary
This paper addresses the under-use of cross-modal context in generative semantic communication (GenSC) by proposing Token Communications (TokCom), a framework in which tokens are the communication units. It brings generative foundation models (GFMs) and multimodal large language models (MLLMs) into semantic communication to model context across modalities at the token level. The framework incorporates a Transformer-based architecture supporting token-wise encoding/decoding, cross-modal alignment, generative semantic compression and reconstruction, and semantic-driven wireless resource adaptation. In an image GenSC setup, TokCom increases bandwidth efficiency by 70.8% with negligible loss of semantic and perceptual quality. This result suggests that cross-modal context modeling is both effective and practical for GenSC.

📝 Abstract
In this paper, we introduce token communications (TokCom), a unified framework to leverage cross-modal context information in generative semantic communications (GenSC). TokCom is a new paradigm, motivated by the recent success of generative foundation models and multimodal large language models (GFM/MLLMs), where the communication units are tokens, enabling efficient transformer-based token processing at the transmitter and receiver. We introduce the potential opportunities and challenges of leveraging context in GenSC, explore how to integrate GFM/MLLM-based token processing into semantic communication systems to leverage cross-modal context effectively, and present the key principles for efficient TokCom at various layers in future wireless networks. We demonstrate the corresponding TokCom benefits in a GenSC setup for images, leveraging cross-modal context information, which increases the bandwidth efficiency by 70.8% with negligible loss of semantic/perceptual quality. Finally, potential research directions are identified to facilitate the adoption of TokCom in future wireless networks.

Problem

Research questions and friction points this paper is trying to address.

Cross-modal context-aware semantic communications
Efficient transformer-based token processing
Bandwidth efficiency in generative semantic communications

Innovation

Methods, ideas, or system contributions that make the work stand out.

Token-based cross-modal communication
Transformer processing for semantic tokens
Multimodal context enhances bandwidth efficiency
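
To make the idea of tokens as communication units concrete, the following is a minimal toy sketch, not the paper's method. It assumes a tiny hand-written word vocabulary (`VOCAB`) and fixed-width binary packing; the names `encode`, `decode`, and `BITS_PER_TOKEN` are hypothetical and chosen only for illustration. A real TokCom system would rely on learned tokenizers and GFM/MLLM-based context modeling, which this sketch leaves out entirely.

```python
import math

# Hypothetical toy vocabulary (an assumption for illustration only);
# a real system would use a learned tokenizer with a far larger vocabulary.
VOCAB = ["<unk>", "a", "cat", "dog", "on", "sits", "the", "mat"]
TOKEN_TO_ID = {tok: i for i, tok in enumerate(VOCAB)}

# Fixed number of bits needed to index any vocabulary entry.
BITS_PER_TOKEN = math.ceil(math.log2(len(VOCAB)))


def encode(text: str) -> str:
    """Transmitter side: map words to token IDs and pack them as a bit string."""
    ids = [TOKEN_TO_ID.get(word, 0) for word in text.lower().split()]
    return "".join(format(i, f"0{BITS_PER_TOKEN}b") for i in ids)


def decode(bits: str) -> str:
    """Receiver side: split the bit string into token IDs and map back to words."""
    ids = [
        int(bits[k:k + BITS_PER_TOKEN], 2)
        for k in range(0, len(bits), BITS_PER_TOKEN)
    ]
    return " ".join(VOCAB[i] for i in ids)


message = "the cat sits on the mat"
payload = encode(message)    # token indices, not raw characters, are sent
recovered = decode(payload)  # receiver rebuilds the token sequence
```

In this toy, the payload carries only token indices, so its size depends on the number of tokens and the vocabulary size rather than on the raw character count. Words outside the vocabulary collapse to `<unk>`, which is one reason practical tokenizers use subword units.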