From Prompts to Performance: Evaluating LLMs for Task-based Parallel Code Generation

📅 2026-02-24
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This study presents the first systematic evaluation of large language models’ (LLMs’) ability to generate efficient task-parallel code across three prominent parallel programming frameworks: OpenMP Tasking, the C++ Standard Parallel Library, and the HPX runtime. We compare the correctness and strong/weak scalability of code generated under three prompting strategies—natural language descriptions, serial reference implementations, and parallel pseudocode. Our experiments reveal that while LLMs perform adequately on simple parallelization tasks, they exhibit significant limitations in handling complex task dependencies and fine-grained control constructs. Moreover, the choice of parallel framework substantially influences the quality of the generated code. These findings provide empirical evidence and delineate the practical boundaries for leveraging LLMs in high-performance computing contexts.

📝 Abstract
Large Language Models (LLMs) show strong abilities in code generation, but their skill in creating efficient parallel programs is less studied. This paper explores how LLMs generate task-based parallel code from three kinds of input prompts: natural language problem descriptions, sequential reference implementations, and parallel pseudocode. We focus on three programming frameworks: OpenMP Tasking, C++ standard parallelism, and the asynchronous many-task runtime HPX. Each framework offers a different level of abstraction and control over task execution. We evaluate LLM-generated solutions for correctness and scalability. Our results reveal both strengths and weaknesses of LLMs with regard to problem complexity and framework. Finally, we discuss what these findings mean for future LLM-assisted development in high-performance and scientific computing.
Problem

Research questions and friction points this paper is trying to address.

Large Language Models
Parallel Code Generation
Task-based Parallelism
Code Correctness
Scalability
Innovation

Methods, ideas, or system contributions that make the work stand out.

Large Language Models
Task-based Parallelism
Code Generation
Prompt Engineering
High-Performance Computing
Linus Bantel
Institute for Parallel and Distributed Systems, University of Stuttgart, Stuttgart, Germany
Moritz Strack
Institute for Parallel and Distributed Systems, University of Stuttgart, Stuttgart, Germany
Alexander Strack
Institute for Parallel and Distributed Systems, University of Stuttgart, Stuttgart, Germany
Dirk Pflüger
University of Stuttgart
Scientific Computing · High-Performance Computing · High-Dimensional Approximation · Numerical Machine Learning