🤖 AI Summary
This paper proposes a task-oriented adaptive semantic transmission framework for 6G semantic communications that balances image reconstruction fidelity against semantic classification accuracy in real time as wireless channel conditions change. The framework has three components: (1) a decision mechanism based on deep reinforcement learning that adjusts the task trade-off according to channel conditions; (2) a Swin Transformer architecture for joint source-channel coding, fitted with module-specific LoRA adapters; and (3) a latent-space diffusion model that restores features corrupted by channel noise. Experiments show that the framework remains robust under AWGN, fading, phase noise, and impulse interference, and that it achieves higher classification accuracy and reconstruction quality than baseline approaches, notably at low SNR. It also reduces adaptation overhead while preserving semantic integrity and perceptual fidelity.
📝 Abstract
The evolution toward 6G networks demands a fundamental shift from bit-centric transmission to semantic-aware communication that emphasizes task-relevant information. This work introduces TOAST (Task-Oriented Adaptive Semantic Transmission), a unified framework that addresses the core challenge of multi-task optimization in dynamic wireless environments through three complementary components. First, we formulate adaptive task balancing as a Markov decision process and use deep reinforcement learning to dynamically adjust the trade-off between image reconstruction fidelity and semantic classification accuracy based on real-time channel conditions. Second, we integrate module-specific Low-Rank Adaptation (LoRA) mechanisms throughout our Swin Transformer-based joint source-channel coding architecture, enabling parameter-efficient fine-tuning that dramatically reduces adaptation overhead while maintaining full performance across diverse channel impairments, including Additive White Gaussian Noise (AWGN), fading, phase noise, and impulse interference. Third, we incorporate an Elucidating diffusion model that operates in the latent space to restore features corrupted by channel noise, substantially improving output quality. Extensive experiments across multiple datasets demonstrate that TOAST outperforms baseline approaches, with significant improvements in both classification accuracy and reconstruction quality under low Signal-to-Noise Ratio (SNR) conditions and robust performance across all tested scenarios.
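To make the parameter-efficiency argument behind LoRA concrete, the following is a minimal sketch of a single low-rank-adapted linear layer in plain Python. It illustrates the general LoRA technique only, not TOAST's actual implementation: the function names, the layer sizes (96 by 96), the rank, and the scaling factor `alpha` are all hypothetical choices made for this example. The frozen weight `W` stays fixed, and only the two small matrices `A` and `B` would be trained for a new channel condition.

```python
import random


def matvec(M, v):
    """Multiply matrix M (a list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def make_lora_layer(d_in, d_out, rank, alpha, seed=0):
    """Build a frozen weight W plus trainable low-rank factors A and B.

    All sizes and the init scheme are illustrative assumptions.
    """
    rng = random.Random(seed)
    # Frozen pretrained weight (random here, standing in for a trained layer).
    W = [[rng.gauss(0, 1) / d_in ** 0.5 for _ in range(d_in)] for _ in range(d_out)]
    # Trainable down-projection, small random init.
    A = [[rng.gauss(0, 0.01) for _ in range(d_in)] for _ in range(rank)]
    # Trainable up-projection, zero init so the adapter starts as a no-op.
    B = [[0.0] * rank for _ in range(d_out)]
    return {"W": W, "A": A, "B": B, "scale": alpha / rank}


def lora_forward(layer, x):
    """Compute W x + (alpha / rank) * B (A x)."""
    base = matvec(layer["W"], x)
    update = matvec(layer["B"], matvec(layer["A"], x))
    return [b + layer["scale"] * u for b, u in zip(base, update)]


def count_params(layer):
    """Return (frozen, trainable) parameter counts."""
    frozen = sum(len(row) for row in layer["W"])
    trainable = sum(len(row) for row in layer["A"]) + sum(len(row) for row in layer["B"])
    return frozen, trainable


layer = make_lora_layer(d_in=96, d_out=96, rank=4, alpha=8.0)
frozen, trainable = count_params(layer)
print(frozen, trainable)
```

With these assumed sizes the adapter adds 768 trainable parameters next to 9,216 frozen ones, roughly 8% of the layer. Because `B` starts at zero, the adapted layer initially reproduces the frozen layer exactly, so adaptation to a new channel condition starts from the pretrained behavior.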