🤖 AI Summary
To address the inference-efficiency bottleneck of deep learning models in resource-constrained edge computing environments, this paper proposes a dynamic, adaptive model-partitioning framework. The framework performs real-time, resource-aware partitioning at layer granularity and supports runtime reconfiguration, overcoming the limitations of conventional static partitioning and fixed deployment strategies. It integrates lightweight resource monitoring, latency-aware partitioning decisions, and device-edge collaborative scheduling, and achieves cross-platform compatibility via ONNX and TensorRT. Experimental evaluation on edge devices, including Raspberry Pi and Jetson Nano, demonstrates up to a 78% reduction in on-device inference latency and a 414% increase in throughput, significantly outperforming baseline approaches. These results validate both the effectiveness and the generalizability of the proposed method across heterogeneous edge platforms.
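The core idea of latency-aware partitioning can be illustrated with a minimal sketch. Note that the summary does not give the paper's actual decision algorithm; the function below, its name, and the cost model (per-layer compute latency plus activation-transfer latency at the split point) are illustrative assumptions, not AMP4EC's implementation.

```python
def best_partition(device_ms, edge_ms, act_bytes, bandwidth_bps):
    """Pick the layer index k minimizing estimated end-to-end latency.

    Hypothetical cost model: layers [0, k) run on the end device,
    layers [k, n) on the edge server, and the activation crossing the
    split is sent over the network.

    device_ms[i] / edge_ms[i]: per-layer latency on each side (ms).
    act_bytes[k]: bytes of the tensor crossing split k (act_bytes[0]
    is the raw input, act_bytes[n] the final output).
    bandwidth_bps: measured link bandwidth in bits per second.
    Returns (split_index, estimated_latency_ms).
    """
    n = len(device_ms)
    best = (0, float("inf"))
    for k in range(n + 1):  # k = 0: all on edge; k = n: all on device
        transfer_ms = act_bytes[k] * 8 / bandwidth_bps * 1000
        total = sum(device_ms[:k]) + transfer_ms + sum(edge_ms[k:])
        if total < best[1]:
            best = (k, total)
    return best
```

Re-running this search whenever the resource monitor reports a significant change in bandwidth or device load is one plausible way to realize the runtime reconfiguration the framework describes.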
📝 Abstract
Edge computing enables efficient deep learning inference in resource-constrained environments. In this paper, we propose AMP4EC, an adaptive model partitioning framework that optimizes inference by dynamically partitioning deep learning models based on real-time resource availability. Our approach achieves a latency reduction of up to 78% and a throughput improvement of 414% compared to baseline methods.