Seamless acceleration of Fortran intrinsics via AMD AI engines

📅 2025-02-14

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

To meet scientific programmers' twin demands for performance and sustainability in high-performance computing (HPC), this paper proposes a way to accelerate Fortran intrinsic functions that requires no source modifications. It establishes, for the first time, an end-to-end compilation pipeline from the Flang compiler, through the MLIR linear algebra dialect, to the AMD AI Engine (AIE) dialect. The approach transparently offloads standard-library computations (e.g., MATMUL, TRANSPOSE) to AIE cores on Ryzen AI CPU platforms while leaving the Fortran source unchanged. Its core contribution is a co-optimization chain spanning compiler, intermediate representation (IR), and hardware: MLIR acts as a semantics-preserving bridge that maps Fortran intrinsics onto AIE-native instructions, so compute-intensive operations can be identified and offloaded at compile time. Evaluation on representative scientific workloads demonstrates speedups of 2.3–5.1× over CPU-only execution and up to 4.7× improvement in energy efficiency, giving a compiler-driven route to sustainable HPC acceleration that needs no changes from the programmer.

📝 Abstract

A major challenge facing the HPC community is how to continue delivering the performance demanded by scientific programmers whilst meeting an increased emphasis on sustainable operations. Specialised architectures, such as FPGAs and AMD's AI Engines (AIEs), have been demonstrated to provide significant energy efficiency advantages. However, programming these architectures effectively requires significant expertise and investment of time, which is a major blocker. Fortran is the lingua franca of scientific computing, and in this paper we explore automatically accelerating Fortran intrinsics via the AIEs in AMD's Ryzen AI CPU. Leveraging the open source Flang compiler and MLIR ecosystem, we describe an approach that lowers the MLIR linear algebra dialect to AMD's AIE dialects, and demonstrate that for suitable workloads the AIEs can provide significant performance advantages over the CPU without any code modifications required by the programmer.
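The idea of picking out intrinsic operations at compile time and redirecting only the suitable ones to the AIEs can be pictured with a toy model. The sketch below is not Flang or MLIR code. The `Op` record, the `select_targets` function, and the element-count threshold are all hypothetical, and the threshold is only an assumed stand-in for whatever makes a workload "suitable" in the paper.

```python
from dataclasses import dataclass

# Hypothetical names throughout: this models the idea only,
# it does not use or mirror any real Flang or MLIR API.
OFFLOADABLE = {"matmul", "transpose"}  # intrinsics named in the summary


@dataclass(frozen=True)
class Op:
    name: str            # operation name, e.g. "matmul"
    shape: tuple         # static result shape known at compile time
    target: str = "cpu"  # where the operation will execute


def select_targets(ops, min_elements=4096):
    """Return a new op list with large offloadable ops retargeted to the AIE.

    min_elements is an assumed heuristic, not a value from the paper.
    """
    result = []
    for op in ops:
        elements = 1
        for dim in op.shape:
            elements *= dim
        if op.name in OFFLOADABLE and elements >= min_elements:
            result.append(Op(op.name, op.shape, "aie"))
        else:
            result.append(op)  # everything else stays on the CPU
    return result


if __name__ == "__main__":
    program = [
        Op("matmul", (256, 256)),
        Op("transpose", (8, 8)),
        Op("sum", (256, 256)),
    ]
    for op in select_targets(program):
        print(op.name, op.shape, op.target)
```

The input program is never edited by hand: the pass alone decides where each operation runs, which is the sense in which the acceleration requires no code modifications from the programmer.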

Problem

Research questions and friction points this paper is trying to address.

Accelerate Fortran intrinsics efficiently

Leverage AMD AI engines for HPC

Minimize programming effort for performance gains

Innovation

Methods, ideas, or system contributions that make the work stand out.

AMD AI Engines acceleration

Flang compiler integration

MLIR dialect transformation
