LLM-Assisted Translation of Legacy FORTRAN Codes to C++: A Cross-Platform Study

📅 2025-04-21
🤖 AI Summary
This study addresses the modernization of legacy Fortran codes in high-performance computing (HPC) by systematically evaluating the applicability and accuracy of large language models (LLMs) for Fortran-to-C++ translation. We propose the first reproducible proxy-based translation evaluation framework, quantifying performance across four dimensions: compilation correctness, semantic fidelity (measured via CodeBLEU), numerical consistency, and cross-platform compatibility (x86/ARM). Evaluated on diverse scientific computing benchmarks using open-weight LLMs, our approach achieves a compilation success rate of up to 89%, an average semantic similarity of 76% relative to human-authored translations, and numerical consistency above 92%. Our key contribution is establishing the first multi-dimensional evaluation paradigm for LLM-based translation of scientific code, empirically validating both its feasibility and its inherent limitations in realistic HPC environments.

📝 Abstract
Large Language Models (LLMs) are increasingly being leveraged for generating and translating scientific computer codes by both domain experts and non-domain experts. Fortran has served as one of the go-to programming languages in legacy high-performance computing (HPC) for scientific discoveries. Despite growing adoption, LLM-based code translation of legacy codebases has not been thoroughly assessed or quantified for its usability. Here, we studied the applicability of LLM-based translation of Fortran to C++ as a step towards building an agentic workflow using open-weight LLMs on two different computational platforms. We statistically quantified the compilation accuracy of the translated C++ codes, measured the similarity of the LLM-translated code to the human-translated C++ code, and statistically quantified the output similarity of the Fortran to C++ translation.
Problem

Research questions and friction points this paper is trying to address.

Assessing LLM-based Fortran to C++ translation accuracy
Comparing LLM-translated code with human-translated C++
Quantifying output similarity between Fortran and C++
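
To make the output-similarity question concrete, here is a minimal Python sketch of one way the printed results of a Fortran program and its C++ translation could be compared. It is not the paper's procedure: the function names `parse_numbers` and `outputs_match` and the tolerance values are assumptions chosen for illustration. The sketch accounts for the fact that Fortran may print double-precision values with a `D` exponent (for example `0.31415927D+01`), which C++ never does.

```python
import math
import re

# Matches C-style and Fortran-style numeric tokens: 42, -1.5, 2.0E-03, 0.31D+01
_NUMBER = re.compile(r"[-+]?(?:\d+\.?\d*|\.\d+)(?:[EeDd][-+]?\d+)?")


def parse_numbers(text):
    """Return every numeric token in text as a float.

    Fortran 'D' exponents are rewritten to 'E' so that float() accepts them.
    Caveat: digits embedded in labels (e.g. 'sum1') are also picked up, so
    real program output may need label stripping first.
    """
    tokens = _NUMBER.findall(text)
    return [float(t.replace("D", "E").replace("d", "e")) for t in tokens]


def outputs_match(fortran_out, cpp_out, rel_tol=1e-6, abs_tol=1e-12):
    """True when both outputs hold the same count of numbers and each
    pair agrees within the given tolerances (illustrative defaults)."""
    a = parse_numbers(fortran_out)
    b = parse_numbers(cpp_out)
    if len(a) != len(b):
        return False
    return all(
        math.isclose(x, y, rel_tol=rel_tol, abs_tol=abs_tol)
        for x, y in zip(a, b)
    )
```

Calling `outputs_match(fortran_stdout, cpp_stdout)` on the captured standard output of each executable yields one pass/fail result per translated program, and those results can then be aggregated over a benchmark set.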
Innovation

Methods, ideas, or system contributions that make the work stand out.

LLM-assisted Fortran to C++ translation
Cross-platform compilation accuracy assessment
Statistical output similarity quantification
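
As an illustration of what a statistical compilation-accuracy assessment might look like, the following Python sketch checks whether a translated file compiles and then summarizes a set of pass/fail results as a success rate with a Wilson score interval. The source does not state which statistic the authors used, so the Wilson interval is a swapped-in, commonly used choice; the helper names `compiles` and `wilson_interval` and the default compiler `g++` are assumptions. The `-fsyntax-only` flag is a standard GCC/Clang option that checks the code without producing an object file.

```python
import math
import subprocess


def compiles(cpp_path, compiler="g++"):
    """Return True if the compiler accepts the file (syntax and semantic
    checks only; no linking). Assumes the compiler is on PATH."""
    result = subprocess.run(
        [compiler, "-fsyntax-only", cpp_path],
        capture_output=True,
        text=True,
    )
    return result.returncode == 0


def wilson_interval(successes, trials, z=1.96):
    """Wilson score interval for a binomial proportion.

    z=1.96 corresponds to an approximate 95% confidence level.
    """
    if trials <= 0:
        raise ValueError("trials must be positive")
    p = successes / trials
    denom = 1.0 + z * z / trials
    center = (p + z * z / (2 * trials)) / denom
    half = z * math.sqrt(
        p * (1 - p) / trials + z * z / (4 * trials * trials)
    ) / denom
    return max(0.0, center - half), min(1.0, center + half)
```

Running `compiles` once per translated file on each platform and passing the counts to `wilson_interval` gives a per-platform success rate with an uncertainty range, which makes differences between platforms or models easier to judge than raw percentages alone.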
Authors

Nishath Rajiv Ranasinghe
Los Alamos National Laboratory
seismology, Geophysics, machine learning

Shawn M. Jones
Los Alamos National Laboratory
Web Science, Digital Preservation, Web Archiving, @WebSciDL

Michal Kucer
Staff Scientist, Los Alamos National Laboratory
Computer Vision, Deep Learning, Machine Learning

Ayan Biswas
Los Alamos National Laboratory, Los Alamos NM 87545

Daniel O'Malley
Los Alamos National Laboratory
applied mathematics, machine learning, computational science, quantum computing

Alexander Most
Los Alamos National Laboratory, Los Alamos NM 87545

Selma Liliane Wanna
Los Alamos National Laboratory, Los Alamos NM 87545

Ajay Sreekumar
School of Information, University of Arizona, 103 E 2nd St #4, Tucson, AZ 85721