🤖 AI Summary
This work addresses key challenges in visual semantic communication: imprecise semantic quantization, weak reconstruction robustness, difficult transmitter–receiver coordination, and poor channel adaptability. It presents a systematic survey of SemCom-Vision that bridges computer vision and communication engineering, and it introduces a new taxonomy built on three communication paradigms: semantic preservation communication (SPC), semantic expansion communication (SEC), and semantic refinement communication (SRC). Within this framework, the survey reviews machine learning–driven encoder-decoder architectures and training algorithms, semantic quantization schemes, and knowledge structure and utilization strategies, yielding interdisciplinary design guidelines. It closes with a discussion of potential SemCom-Vision applications.
📝 Abstract
Semantic communication (SemCom) has emerged as a transformative paradigm for traffic-intensive visual data transmission, shifting the focus from transmitting raw data to transmitting meaningful content and thereby relieving the growing pressure on communication resources. However, realizing SemCom raises several challenges: accurate semantic quantization of visual data, robust semantic extraction and reconstruction under diverse tasks and goals, transceiver coordination with effective knowledge utilization, and adaptation to unpredictable wireless communication environments. In this paper, we present a systematic review of SemCom for visual data transmission (SemCom-Vision), conducting an interdisciplinary analysis that integrates computer vision (CV) and communication engineering to provide comprehensive guidelines for machine learning (ML)-empowered SemCom-Vision design. Specifically, this survey first elucidates the basics and key concepts of SemCom. We then introduce a novel classification perspective that categorizes existing SemCom-Vision approaches as semantic preservation communication (SPC), semantic expansion communication (SEC), or semantic refinement communication (SRC), based on communication goals interpreted through semantic quantization schemes. Moreover, this survey describes the ML-based encoder-decoder models and training algorithms for each SemCom-Vision category, followed by knowledge structure and utilization strategies. Finally, we discuss potential SemCom-Vision applications.