🤖 AI Summary
This work extends ACETONE, a C code generator originally limited to sequential code, toward multi-core parallel code generation, so that neural networks deployed on multi-core embedded platforms can exploit the available parallelism. The paper formally defines the processor assignment problem and surveys existing solutions. The completed extension is to add a scheduling heuristic and templates implementing synchronization mechanisms, so that neural network layers can be mapped to and executed on multiple cores. An evaluation of the worst-case execution time (WCET) of the framework's layers is also planned, in support of predictable and verifiable deployment of deep learning in safety-critical systems.
📝 Abstract
As industry interest in machine learning has grown in recent years, solutions have emerged for embedding machine learning models safely in safety-critical systems, such as the C code generator ACETONE. However, this framework is limited to generating sequential code, which cannot make the most of multi-core architectures. In this paper, we initiate an extension of ACETONE for the generation of parallel code by formally defining our processor assignment problem and surveying the state of the art on existing solutions. In the final paper, we will introduce the completed extension, including the implementation of the scheduling heuristic, the creation of templates implementing synchronization mechanisms, and an evaluation of the worst-case execution time of the framework's layers.