🤖 AI Summary
To address the energy-efficiency–latency trade-off in deploying CNNs on resource-constrained devices (e.g., mobile and autonomous driving platforms), this paper proposes a co-optimization framework integrating layer skipping with dynamic voltage and frequency scaling (DVFS). We introduce proportional layer skipping (PLS), a novel mechanism that models the layer-skipping ratio as a continuous, tunable parameter, jointly optimized with CPU frequency. This enables fine-grained, hardware-aware inference acceleration. Further, we formulate a tri-objective optimization framework balancing inference latency, energy consumption, and model accuracy, overcoming the limitations of conventional single-objective model compression. Evaluations of ResNet-152 on CIFAR-10 demonstrate that our method reduces energy consumption by 42.7% and computational operations by 38.5%, while maintaining bounded inference latency and incurring only a 0.9% accuracy drop relative to the baseline.
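The core idea of PLS, as described above, is that the skip ratio is a continuous knob rather than an on/off decision per layer. A minimal sketch of one plausible realization, not the paper's exact policy: given a skip ratio, spread the skipped residual blocks evenly over the network depth and return the indices of the blocks that should still execute (function and selection scheme are illustrative assumptions).

```python
def select_layers(num_layers, skip_ratio):
    """Choose which residual blocks to execute for a given continuous skip ratio.

    skip_ratio lies in [0, 1); skipped blocks are spread uniformly over the
    depth. This selection policy is an illustrative assumption, not the
    paper's exact mechanism.
    """
    n_skip = round(num_layers * skip_ratio)
    if n_skip == 0:
        return list(range(num_layers))
    step = num_layers / n_skip
    # mark every ~step-th block for skipping (bypassed via the identity path)
    skipped = {int(i * step) for i in range(n_skip)}
    return [i for i in range(num_layers) if i not in skipped]
```

Because the ratio is continuous, an outer optimizer can search over (skip_ratio, CPU frequency) pairs jointly instead of treating compression and DVFS as separate stages.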
📝 Abstract
The energy consumption of Convolutional Neural Networks (CNNs) is a critical factor in deploying deep learning models on resource-limited equipment such as mobile devices and autonomous vehicles. We propose an approach combining Proportional Layer Skipping (PLS) and Frequency Scaling (FS). Layer skipping reduces computational complexity by selectively bypassing network layers, while frequency scaling adjusts processor frequency to optimize energy use under latency constraints. Experiments with PLS and FS on ResNet-152 and the CIFAR-10 dataset demonstrate significant reductions in computational demands and energy consumption with minimal accuracy loss. This study offers practical solutions for improving real-time processing in resource-limited settings and provides insights into balancing computational efficiency and model performance.
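The frequency-scaling side can be illustrated with a simplified DVFS model: since dynamic power grows super-linearly with frequency (roughly f³ when voltage scales with frequency), the lowest frequency that still meets the latency deadline minimizes energy. The sketch below assumes this simplified model and a hypothetical `pick_frequency` helper; it is not the paper's optimizer.

```python
def pick_frequency(ops, deadline_s, freq_levels_hz, ops_per_cycle=1.0):
    """Pick the lowest CPU frequency that still meets the latency deadline.

    Simplified model (assumption): latency = ops / (f * ops_per_cycle).
    Under the usual DVFS power model (P ~ f^3 with voltage scaling),
    the lowest feasible frequency minimizes energy for the inference.
    """
    for f in sorted(freq_levels_hz):
        if ops / (f * ops_per_cycle) <= deadline_s:
            return f
    # deadline infeasible at any level: fall back to the fastest setting
    return max(freq_levels_hz)
```

In the combined scheme, layer skipping first lowers `ops`, which in turn lets the governor settle on a lower frequency for the same deadline, compounding the energy savings.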