AI Summary
To address the bottlenecks of high bandwidth consumption, significant latency, and weak privacy protection in cloud-edge collaborative inference for edge-intelligent vision deployment, this paper proposes FCTM, the first standardized intermediate-feature coding/decoding framework tailored for machine vision tasks and aligned with the MPEG Feature Coding for Machines (FCM) standard. Methodologically, FCTM integrates task-aware feature importance modeling, joint optimization of quantization and entropy coding, and lightweight inter-frame and inter-channel redundancy elimination, achieving a unified trade-off between semantic fidelity and high compression ratio. Evaluated across detection, segmentation, and recognition tasks, FCTM achieves an average bitrate reduction of 85.14% with negligible accuracy degradation (under 0.3% mAP/mIoU), enabling real-time inference. The framework has been officially validated by MPEG.
Abstract
Deep neural networks (DNNs) drive modern machine vision but are challenging to deploy on edge devices due to high compute demands. Traditional approaches, running the full model on-device or offloading it entirely to the cloud, face trade-offs in latency, bandwidth, and privacy. Splitting the inference workload between the edge and the cloud offers a balanced solution, but transmitting intermediate features to enable such splitting introduces new bandwidth challenges. To address this, the Moving Picture Experts Group (MPEG) initiated the Feature Coding for Machines (FCM) standard, establishing a bitstream syntax and codec pipeline tailored for compressing intermediate features. This paper presents the design and performance of the Feature Coding Test Model (FCTM), showing significant bitrate reductions, averaging 85.14%, across multiple vision tasks while preserving accuracy. FCM offers a scalable path for efficient and interoperable deployment of intelligent features in bandwidth-limited and privacy-sensitive consumer applications.
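The split-inference idea the abstract describes can be sketched in a few lines: an edge-side sub-network produces an intermediate feature, the feature is quantized to integer symbols (which an entropy coder can then compress for transmission), and the cloud-side sub-network finishes the inference from the reconstructed feature. This is a minimal illustrative sketch, not the FCTM codec; all function names and the toy "networks" are hypothetical.

```python
def edge_part(x):
    """Hypothetical edge-side sub-network: maps input to an intermediate feature."""
    return [v * 0.5 + 1.0 for v in x]

def quantize(feat, step=0.25):
    """Uniform scalar quantization: each value becomes an integer symbol.

    In a real pipeline these symbols would be entropy-coded before transmission;
    coarser steps cut bitrate at the cost of feature fidelity."""
    return [round(v / step) for v in feat]

def dequantize(symbols, step=0.25):
    """Reconstruct an approximate feature from the received symbols."""
    return [s * step for s in symbols]

def cloud_part(feat):
    """Hypothetical cloud-side sub-network: completes inference from the feature."""
    return sum(feat)

x = [0.1, 0.4, 0.9, 1.3]
feat = edge_part(x)              # computed on the edge device
symbols = quantize(feat)         # integers: cheap to entropy-code and transmit
result = cloud_part(dequantize(symbols))  # completed in the cloud
print(result)
```

Only the small integer symbols cross the network, which is the bandwidth saving the standard targets; the quantization step is the knob trading bitrate against task accuracy.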