🤖 AI Summary
To address low data-transmission efficiency and high terminal power consumption in edge-cloud collaborative AI inference, this paper proposes an edge-cloud co-inference framework based on encoding and compressing intermediate neural features. The method introduces intermediate-layer neural features as standardized coding units in the MPEG-AI international standard Feature Coding for Machines (FCM), enabling feature-level collaborative inference with no loss of accuracy. It integrates lightweight feature extraction, adaptive quantization-based compression, and joint edge-cloud scheduling. Experimental results show that, compared with conventional remote inference, the framework reduces the transmission bitrate by 75.90% with no degradation in model accuracy, while significantly improving inference latency and terminal energy efficiency. This offers an efficient, practical path for deploying large models on resource-constrained, low-power devices.
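The split-inference idea above can be illustrated with a minimal sketch: the device runs the front part of a network, uniformly quantizes the intermediate feature tensor to 8-bit codes, and ships those (plus two scalars) to the server, which dequantizes and finishes inference. The function names and the uniform 8-bit scheme here are illustrative assumptions, not the actual FCM codec, which adds entropy coding and richer adaptive quantization.

```python
import numpy as np

def quantize_features(feats: np.ndarray, n_bits: int = 8):
    """Uniformly quantize a float32 feature tensor to n_bits integer codes.

    Returns the codes plus the (scale, offset) the server needs to
    dequantize. This is a stand-in for FCM's adaptive quantization.
    """
    f_min, f_max = float(feats.min()), float(feats.max())
    levels = (1 << n_bits) - 1
    scale = (f_max - f_min) / levels if f_max > f_min else 1.0
    codes = np.round((feats - f_min) / scale).astype(np.uint8)
    return codes, scale, f_min

def dequantize_features(codes: np.ndarray, scale: float, f_min: float) -> np.ndarray:
    """Server side: reconstruct an approximate float32 feature tensor."""
    return codes.astype(np.float32) * scale + f_min

# Round trip: sending 8-bit codes instead of 32-bit floats is already a
# 4x payload reduction before any entropy coding is applied.
feats = np.random.randn(1, 256, 16, 16).astype(np.float32)  # mock intermediate features
codes, scale, f_min = quantize_features(feats, n_bits=8)
recon = dequantize_features(codes, scale, f_min)
assert codes.nbytes * 4 == feats.nbytes
assert float(np.max(np.abs(recon - feats))) <= scale  # error within one quantization step
```

In a real deployment the quantized codes would be further entropy-coded, which is where bitrate reductions of the magnitude reported (75.90%) come from.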
📝 Abstract
As consumer devices become increasingly intelligent and interconnected, efficient data-transfer solutions for machine tasks have become essential. This paper presents an overview of the latest Feature Coding for Machines (FCM) standard, part of MPEG-AI and developed by the Moving Picture Experts Group (MPEG). FCM supports AI-driven applications by enabling the efficient extraction, compression, and transmission of intermediate neural network features. By offloading computationally intensive operations to servers with ample computing resources, FCM allows low-power devices to leverage large deep learning models. Experimental results indicate that the FCM standard maintains the same level of accuracy while reducing bitrate requirements by 75.90% compared to remote inference.