Adaptive Pareto-Optimal Token Merging for Edge Transformer Models in Semantic Communication

📅 2025-09-11
📈 Citations: 0
Influential: 0
🤖 AI Summary
Transformer-based semantic communication models incur prohibitive computational overhead, hindering their deployment in 6G edge networks. Method: This paper proposes a training-free, adaptive token-fusion framework that jointly minimizes inference latency and transmission resource consumption by formulating the selection of layer-wise token merging ratios as a multi-objective optimization problem and searching for Pareto-optimal configurations. A channel-aware runtime adaptation mechanism, built upon Gaussian process-based Bayesian optimization, dynamically adjusts the merging strength according to the real-time signal-to-noise ratio (SNR). The method integrates pre-trained Vision Transformers (ViTs), Token Merging (ToMe), Bayesian optimization, and Pareto frontier search. Contribution/Results: Experiments across diverse SNR conditions demonstrate significant reductions in floating-point operations (FLOPs), accelerating inference while preserving semantic fidelity. The framework enables on-demand accuracy-efficiency trade-offs, validating its effectiveness for resource-constrained 6G edge semantic communication.
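The Token Merging (ToMe) step referenced above works by bipartite soft matching: tokens are split into two alternating sets, and the most mutually similar pairs are averaged together, shrinking the sequence without any retraining. The sketch below is a simplified NumPy illustration under that description, not the authors' implementation (ToMe additionally tracks merge weights for size-aware averaging):

```python
import numpy as np

def bipartite_merge(tokens, r):
    """Simplified ToMe-style bipartite soft matching.

    tokens: (n, d) array of token embeddings; r: number of tokens to remove.
    Alternating tokens form sets A and B; the r tokens in A most similar to
    some token in B are averaged into their best match (plain average here,
    a simplification of ToMe's weighted merging).
    """
    a, b = tokens[0::2], tokens[1::2]
    # cosine similarity between every A token and every B token
    an = a / np.linalg.norm(a, axis=1, keepdims=True)
    bn = b / np.linalg.norm(b, axis=1, keepdims=True)
    sim = an @ bn.T                      # shape (|A|, |B|)
    best = sim.argmax(axis=1)            # best B match for each A token
    score = sim.max(axis=1)
    merge_idx = np.argsort(-score)[:r]   # the r most mergeable A tokens
    keep_idx = np.setdiff1d(np.arange(len(a)), merge_idx)

    merged_b = b.copy()
    for i in merge_idx:                  # fold each merged A token into B
        j = best[i]
        merged_b[j] = (merged_b[j] + a[i]) / 2.0
    return np.concatenate([a[keep_idx], merged_b], axis=0)
```

Applying this with ratio r at each transformer layer is what makes the per-layer merging proportions the decision variables of the optimization problem.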

📝 Abstract
Large-scale transformer models have emerged as a powerful tool for semantic communication systems, enabling edge devices to extract rich representations for robust inference across noisy wireless channels. However, their substantial computational demands remain a major barrier to practical deployment in resource-constrained 6G networks. In this paper, we present a training-free framework for adaptive token merging in pretrained vision transformers to jointly reduce inference time and transmission resource usage. We formulate the selection of per-layer merging proportions as a multi-objective optimization problem to balance accuracy and computational cost. We employ Gaussian process-based Bayesian optimization to construct a Pareto frontier of optimal configurations, enabling flexible runtime adaptation to dynamic application requirements and channel conditions. Extensive experiments demonstrate that our method consistently outperforms other baselines and achieves significant reductions in floating-point operations while maintaining competitive accuracy across a wide range of signal-to-noise ratio (SNR) conditions. Additional results highlight the effectiveness of adaptive policies that adjust merging aggressiveness in response to channel quality, providing a practical mechanism to trade off latency and semantic fidelity on demand. These findings establish a scalable and efficient approach for deploying transformer-based semantic communication in future edge intelligence systems.
Problem

Research questions and friction points this paper is trying to address.

Reducing computational demands of edge transformers
Balancing accuracy and computational cost trade-offs
Adapting token merging to dynamic channel conditions
Innovation

Methods, ideas, or system contributions that make the work stand out.

Training-free token merging framework
Multi-objective Bayesian optimization selection
Adaptive policies for channel conditions
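The innovations above combine two pieces: a Pareto frontier over (computational cost, accuracy loss) configurations, and a channel-aware policy that picks a point on that frontier from the current SNR. The sketch below illustrates both under stated assumptions; the non-dominated filter is standard, while the linear SNR weighting and the `select_config` helper are hypothetical stand-ins for the paper's GP-based Bayesian optimization policy:

```python
import numpy as np

def pareto_front(points):
    """Indices of non-dominated rows of points (both objectives minimized).

    points: (n, 2) array of (compute_cost, accuracy_loss) per configuration.
    """
    front = []
    for i, p in enumerate(points):
        dominated = any(
            np.all(q <= p) and np.any(q < p)
            for j, q in enumerate(points) if j != i
        )
        if not dominated:
            front.append(i)
    return front

def select_config(front_points, configs, snr_db, snr_lo=0.0, snr_hi=20.0):
    """Channel-aware pick from the frontier (illustrative linear policy).

    Low SNR -> favor accuracy (gentle merging); high SNR -> favor speed
    (aggressive merging). Objectives are min-max normalized, then scalarized.
    """
    w = np.clip((snr_db - snr_lo) / (snr_hi - snr_lo), 0.0, 1.0)
    pts = (front_points - front_points.min(0)) / (np.ptp(front_points, 0) + 1e-9)
    scores = w * pts[:, 0] + (1.0 - w) * pts[:, 1]  # weighted scalarization
    return configs[int(np.argmin(scores))]
```

In the paper's framework the frontier itself is built offline by GP-based Bayesian optimization over layer-wise merging ratios; only the cheap frontier lookup runs at inference time, which is what makes the adaptation training-free and suitable for edge deployment.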
Omar Erak
KU 6G Research Centre, College of Computing and Mathematical Sciences, Khalifa University, UAE
Omar Alhussein
Khalifa University
Networking and AI · Network Optimization · Edge Intelligence · Quantum Computing
Hatem Abou-Zeid
Schulich Industry Chair in AI for 6G, University of Calgary
6G · Artificial Intelligence · Sensing · XR Networking · AI for BCIs
Mehdi Bennis
Centre for Wireless Communications, University of Oulu, Finland