Seamless acceleration of Fortran intrinsics via AMD AI engines

📅 2025-02-14
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address scientific programmers’ dual requirements for performance and sustainability in high-performance computing (HPC), this paper proposes a source-code-transparent acceleration method for Fortran intrinsic functions—requiring no source modifications. It establishes, for the first time, an end-to-end compilation pipeline from the Flang compiler through the MLIR linear algebra dialect to the AMD AI Engine (AIE) dialect. The approach enables transparent offloading of standard-library computations (e.g., MATMUL, TRANSPOSE) to AIE cores on Ryzen AI CPU platforms while guaranteeing 100% Fortran source compatibility. Its core innovation lies in a cross-stack co-optimization chain spanning compiler, intermediate representation (IR), and hardware: MLIR serves as a semantic-preserving bridge, enabling precise mapping from Fortran intrinsics to AIE-native instructions and facilitating compile-time identification and offloading of compute-intensive operations. Evaluation on representative scientific workloads demonstrates speedups of 2.3–5.1× over CPU-only execution and up to 4.7× improvement in energy efficiency—delivering a zero-intrusion, compiler-driven solution for sustainable HPC acceleration.

📝 Abstract
A major challenge facing the HPC community is how to continue delivering the performance demanded by scientific programmers whilst meeting an increased emphasis on sustainable operations. Specialised architectures, such as FPGAs and AMD's AI Engines (AIEs), have been demonstrated to provide significant energy efficiency advantages; however, programming them effectively requires significant expertise and investment of time, which is a major blocker. Fortran is the lingua franca of scientific computing, and in this paper we explore automatically accelerating Fortran intrinsics via the AIEs in AMD's Ryzen AI CPU. Leveraging the open source Flang compiler and MLIR ecosystem, we describe an approach that lowers the MLIR linear algebra dialect to AMD's AIE dialects, and demonstrate that for suitable workloads the AIEs can provide significant performance advantages over the CPU without any code modifications required by the programmer.
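To make "without any code modifications" concrete, the following is a minimal sketch of the kind of standard Fortran source the abstract has in mind. It uses only the MATMUL and TRANSPOSE intrinsics named in the summary above. The program name, array names, and the 4x4 size are illustrative assumptions, not taken from the paper, and whether this exact expression would be offloaded by the authors' pipeline is not stated in the source.

```fortran
program intrinsics_demo
  ! Illustrative only: names and sizes are assumptions, not from the paper.
  implicit none
  integer, parameter :: n = 4
  real :: a(n, n), b(n, n), c(n, n)
  integer :: i, j

  ! Fill the inputs with simple, reproducible values.
  do j = 1, n
    do i = 1, n
      a(i, j) = real(i + j)
      b(i, j) = real(i - j)
    end do
  end do

  ! Standard intrinsics only: no directives, no vendor API calls.
  ! A compiler-driven approach would recognise these calls and decide
  ! where to run them; the source itself stays unchanged.
  c = matmul(a, transpose(b))

  print *, c(1, 1)
end program intrinsics_demo
```

The point of the sketch is that the program is plain standard Fortran: any offloading decision is left to the compiler rather than expressed in the code.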
Problem

Research questions and friction points this paper is trying to address.

Accelerate Fortran intrinsics efficiently
Leverage AMD AI engines for HPC
Minimize programming effort for performance gains
Innovation

Methods, ideas, or system contributions that make the work stand out.

AMD AI Engines acceleration
Flang compiler integration
MLIR dialect transformation
Nick Brown
Senior Research Fellow, EPCC at the University of Edinburgh
HPC, FPGAs, RISC-V, compilers, novel architectures
Gabriel Rodríguez Canal
EPCC at the University of Edinburgh, Edinburgh, UK