Wireless Video Semantic Communication with Decoupled Diffusion Multi-frame Compensation

📅 2025-11-04
📈 Citations: 0
Influential: 0
🤖 AI Summary
Traditional wireless video transmission codes video at the pixel level and neglects the semantics contained in the video. To address this, the paper proposes WVSC-D, a wireless video semantic communication framework that incorporates diffusion models. Methodologically: (1) original video frames are encoded as compact semantic frames, and video coding is performed on these representations; (2) a reference semantic frame replaces the per-frame motion vectors used in common video coding methods; and (3) at the receiver, a decoupled diffusion multi-frame compensation (DDMFC) mechanism generates the compensated current semantic frame through a two-stage conditional diffusion process. Experiments report a gain of about 1.8 dB in PSNR over other DL-based methods such as DVSC, together with improved bandwidth efficiency.

📝 Abstract
Existing wireless video transmission schemes perform video coding directly at the pixel level and neglect the semantics contained in videos. In this paper, we propose a wireless video semantic communication framework with decoupled diffusion multi-frame compensation (DDMFC), abbreviated as WVSC-D, which integrates semantic communication into wireless video transmission. WVSC-D first encodes the original video frames as semantic frames and then performs video coding on these compact representations, so that coding takes place at the semantic level rather than the pixel level. To further reduce communication overhead, a reference semantic frame is introduced to replace the per-frame motion vectors used in common video coding methods. At the receiver, DDMFC generates the compensated current semantic frame through a two-stage conditional diffusion process. With reference frame transmission and DDMFC frame compensation, bandwidth efficiency improves while video transmission performance remains satisfactory. Experimental results show that WVSC-D outperforms other DL-based methods, e.g., DVSC, by about 1.8 dB in PSNR.
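To put the reported PSNR figure in context, the following is a minimal sketch of how PSNR is conventionally computed and of what a gain of a given size implies for mean squared error. The function names `psnr` and `mse_ratio_for_gain` and the 8-bit peak value of 255 are assumptions made for illustration; the abstract does not specify the evaluation setup.

```python
import math


def psnr(reference, reconstruction, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized 2-D frames.

    Frames are nested lists of pixel values; `peak` is the maximum possible
    pixel value (255 is assumed here for 8-bit video).
    """
    ref = [float(v) for row in reference for v in row]
    rec = [float(v) for row in reconstruction for v in row]
    if len(ref) != len(rec):
        raise ValueError("frames must contain the same number of pixels")
    mse = sum((a - b) ** 2 for a, b in zip(ref, rec)) / len(ref)
    if mse == 0:
        return float("inf")  # identical frames
    return 10.0 * math.log10(peak ** 2 / mse)


def mse_ratio_for_gain(gain_db):
    """Factor by which MSE shrinks for a PSNR gain of `gain_db` (same peak)."""
    return 10.0 ** (gain_db / 10.0)


if __name__ == "__main__":
    original = [[10, 20], [30, 40]]
    decoded = [[12, 18], [33, 40]]
    print(f"PSNR: {psnr(original, decoded):.2f} dB")
    print(f"MSE reduction for a 1.8 dB gain: {mse_ratio_for_gain(1.8):.3f}x")
```

Because PSNR is logarithmic in MSE, a gain of about 1.8 dB at the same peak value corresponds to a reconstruction MSE that is lower by a factor of roughly 1.51.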
Problem

Research questions and friction points this paper is trying to address.

Wireless video transmission neglects semantic content
Pixel-level coding fails to utilize video semantics
Existing methods have high bandwidth requirements
Innovation

Methods, ideas, or system contributions that make the work stand out.

Semantic-level video coding replaces pixel-level processing
Reference semantic frame substitutes motion vector transmission
Two-stage conditional diffusion process (DDMFC) generates the compensated current semantic frame at the receiver
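The first point above, coding on a compact representation instead of raw pixels, can be illustrated with a toy sketch. Average pooling is used here purely as a stand-in for the learned semantic encoder, which the summary does not describe; the helper names `average_pool` and `value_count`, the 64x64 frame, and the 4x4 block size are all hypothetical.

```python
def average_pool(frame, block):
    """Downsample a 2-D frame by averaging non-overlapping block x block tiles.

    This is only a placeholder for a learned semantic encoder: it shows the
    size reduction, not the semantic content such an encoder would extract.
    """
    height, width = len(frame), len(frame[0])
    if height % block or width % block:
        raise ValueError("frame dimensions must be divisible by block")
    pooled = []
    for top in range(0, height, block):
        row = []
        for left in range(0, width, block):
            tile = [frame[top + dy][left + dx]
                    for dy in range(block) for dx in range(block)]
            row.append(sum(tile) / len(tile))
        pooled.append(row)
    return pooled


def value_count(frame):
    """Number of scalar values that would need to be coded for this frame."""
    return sum(len(row) for row in frame)


if __name__ == "__main__":
    # Synthetic 64x64 "pixel frame" (hypothetical size, for illustration only).
    pixel_frame = [[(x + y) % 256 for x in range(64)] for y in range(64)]
    compact_frame = average_pool(pixel_frame, block=4)
    print(value_count(pixel_frame), "values at the pixel level")
    print(value_count(compact_frame), "values in the compact representation")
```

In this toy setting the compact frame holds one sixteenth of the values of the pixel frame. The ratio follows from the chosen block size and says nothing about the compression actually achieved by WVSC-D.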
🔎 Similar Papers
Bingyan Xie
Department of Electronic Engineering, Shanghai Jiao Tong University, Shanghai 200240, China
Yongpeng Wu
Department of Electronic Engineering, Shanghai Jiao Tong University, Shanghai 200240, China
Yuxuan Shi
School of Cyber Science and Engineering, Shanghai Jiao Tong University, Shanghai 200240, China
Biqian Feng
University of Macau
Physical Layer Communication, Internet of Vehicles
Wenjun Zhang
City University of Hong Kong
Thin film technology, nanomaterials and nanodevices
Jihong Park
Associate Professor, SUTD, SMIEEE
Wireless Communications, Semantic Communication, Distributed Machine Learning, AI-RAN
Tony Q. S. Quek
ISTD Pillar, Singapore University of Technology and Design, 8 Somapah Rd, Singapore 487372