Wireless Video Semantic Communication with Decoupled Diffusion Multi-frame Compensation

📅 2025-11-04

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Traditional wireless video transmission codes video at the pixel level and neglects the semantics the video carries. To address this, the paper proposes WVSC-D, a wireless video semantic communication framework built around decoupled diffusion multi-frame compensation (DDMFC). The method has three parts: (1) a semantic encoder maps original video frames to compact semantic frames, so that video coding operates at the semantic level; (2) a reference semantic frame is transmitted in place of the per-frame motion vectors used by common video coding methods, which reduces communication overhead; and (3) at the receiver, DDMFC generates the compensated current semantic frame through a two-stage conditional diffusion process. Experiments report a PSNR gain of about 1.8 dB over other DL-based methods such as DVSC, together with improved bandwidth efficiency.

📝 Abstract

Existing wireless video transmission schemes conduct video coding directly at the pixel level and neglect the inner semantics contained in videos. In this paper, we propose a wireless video semantic communication framework with decoupled diffusion multi-frame compensation (DDMFC), abbreviated as WVSC-D, which integrates the idea of semantic communication into wireless video transmission scenarios. WVSC-D first encodes the original video frames as semantic frames and then conducts video coding on these compact representations, so that video coding takes place at the semantic level rather than the pixel level. Moreover, to further reduce the communication overhead, a reference semantic frame is introduced to replace the per-frame motion vectors used in common video coding methods. At the receiver, DDMFC generates the compensated current semantic frame through a two-stage conditional diffusion process. With both reference frame transmission and DDMFC frame compensation, bandwidth efficiency improves while video transmission performance remains satisfactory. Experimental results verify that WVSC-D outperforms other DL-based methods, e.g., DVSC, by about 1.8 dB in terms of PSNR.
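To put the reported 1.8 dB figure in context, the following is a minimal sketch of how PSNR is conventionally computed and of what a PSNR gain implies for mean squared error. It is not the paper's evaluation code: the pixel values are made up for illustration, `max_value=255` assumes 8-bit samples, and the function names are hypothetical.

```python
import math


def mse(reference, reconstructed):
    """Mean squared error between two equally sized flat pixel sequences."""
    if len(reference) != len(reconstructed) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((r - x) ** 2 for r, x in zip(reference, reconstructed)) / len(reference)


def psnr(reference, reconstructed, max_value=255.0):
    """PSNR in dB: 10 * log10(max_value^2 / MSE). Assumes 8-bit samples by default."""
    err = mse(reference, reconstructed)
    if err == 0:
        return math.inf  # identical signals
    return 10.0 * math.log10(max_value ** 2 / err)


def mse_ratio_for_gain(gain_db):
    """Factor by which MSE shrinks when PSNR rises by gain_db (same max_value)."""
    return 10.0 ** (-gain_db / 10.0)


# Made-up pixel values, for illustration only (not data from the paper).
reference = [52, 55, 61, 66, 70, 61, 64, 73]
baseline = [54, 52, 64, 63, 73, 58, 67, 70]

print(f"baseline PSNR: {psnr(reference, baseline):.2f} dB")
print(f"MSE ratio for a 1.8 dB gain: {mse_ratio_for_gain(1.8):.3f}")
```

Because PSNR is logarithmic in MSE, a gain of 1.8 dB at a fixed peak value corresponds to the MSE falling to roughly 0.66 of its previous value, that is, about a one-third reduction in squared error.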

Problem

Research questions and friction points this paper is trying to address.

Wireless video transmission neglects semantic content

Pixel-level coding fails to utilize video semantics

Per-frame motion vectors in common video coding methods add communication overhead

Innovation

Methods, ideas, or system contributions that make the work stand out.

Semantic-level video coding replaces pixel-level processing

Reference semantic frame substitutes motion vector transmission

Two-stage conditional diffusion (DDMFC) generates the compensated current semantic frame at the receiver

🔎 Similar Papers

VideoQA-SC: Adaptive Semantic Communication for Video Question Answering

2024-05-17 · arXiv.org · Citations: 1
