🤖 AI Summary
This work addresses the challenge of executing convolutional neural networks on memory-constrained, dedicated accelerators while meeting real-time requirements. To this end, the authors propose a formal framework for modeling convolution offloading sequences, recasting the scheduling problem as a constrained optimization task. By integrating convolution decomposition techniques, the framework constructs precise models of resource usage and latency constraints. A Python-based simulation platform is employed to systematically evaluate diverse scheduling strategies, enabling predictable and efficient utilization of accelerator resources. The proposed approach significantly enhances the execution efficiency of convolution operations, simultaneously satisfying real-time constraints and optimizing overall scheduling performance.
📝 Abstract
Convolutional neural networks (CNNs) require a large number of multiply-accumulate (MAC) operations. To meet real-time constraints, they often need to be executed on specialized accelerators composed of an on-chip memory and a processing unit. However, the on-chip memory is often insufficient to store all the data required to compute a CNN layer, so the computation must be performed in several offloading steps. We formalise such sequences of steps and apply our formalism to a state-of-the-art decomposition of convolutions. To find strategies that are optimal in terms of duration, we encode the problem as a set of constraints. A Python-based simulator allows in-depth analysis of the computed strategies.
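To make the offloading idea concrete, here is a minimal sketch of splitting one convolution layer's computation into steps whose working set fits in on-chip memory. The function name, the greedy grouping over output channels, and all sizes are illustrative assumptions, not the paper's actual model or decomposition.

```python
# Hypothetical sketch: group a layer's output channels into offloading
# steps so that each step's data (input + weights + partial output)
# fits within the on-chip memory capacity.

def offloading_steps(in_bytes, weight_bytes_per_ch, out_bytes_per_ch,
                     out_channels, mem_capacity):
    """Greedily pack output channels into steps that fit on chip."""
    steps = []
    ch = 0
    while ch < out_channels:
        group = 0
        # Add output channels while the step's working set still fits.
        while ch + group < out_channels:
            need = in_bytes + (group + 1) * (weight_bytes_per_ch
                                             + out_bytes_per_ch)
            if need > mem_capacity:
                break
            group += 1
        if group == 0:
            raise ValueError("a single output channel does not fit on chip")
        steps.append((ch, ch + group))
        ch += group
    return steps

# Example: 64 output channels, 40 KiB input, 1 KiB of weights and
# 2 KiB of output per channel, 64 KiB of on-chip memory.
print(offloading_steps(40 * 1024, 1024, 2 * 1024, 64, 64 * 1024))
```

Under these assumed sizes the layer decomposes into eight steps of eight output channels each; a duration-optimal strategy, as studied in the paper, would additionally account for the transfer and compute latency of each step rather than only its memory footprint.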