Can Large Language Models Understand Intermediate Representations?

📅 2025-02-07
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work investigates the capacity of large language models (LLMs) to comprehend compiler intermediate representations (IRs), evaluating their performance on four core tasks: control-flow graph (CFG) reconstruction, decompilation, code summarization, and execution reasoning. Through systematic multi-model benchmarking (GPT-4, LLaMA 3.1, Gemma 2, etc.), a curated structured IR dataset, task-specific prompt engineering, and fine-grained error attribution, the study provides the first empirical evidence of fundamental limitations in LLMs' IR understanding, particularly in CFG reconstruction (accuracy below 42%) and execution reasoning (error rate of 68%). Methodologically, it introduces a dual-path enhancement paradigm: (1) IR-domain fine-tuning and (2) explicit control-flow modeling. Experimental results demonstrate that targeted fine-tuning improves task performance by up to 31.5%, establishing a foundational framework for LLM-based IR analysis.

📝 Abstract
Intermediate Representations (IRs) are essential in compiler design and program analysis, yet their comprehension by Large Language Models (LLMs) remains underexplored. This paper presents a pioneering empirical study to investigate the capabilities of LLMs, including GPT-4, GPT-3, Gemma 2, LLaMA 3.1, and Code Llama, in understanding IRs. We analyze their performance across four tasks: Control Flow Graph (CFG) reconstruction, decompilation, code summarization, and execution reasoning. Our results indicate that while LLMs demonstrate competence in parsing IR syntax and recognizing high-level structures, they struggle with control flow reasoning, execution semantics, and loop handling. Specifically, they often misinterpret branching instructions, omit critical IR operations, and rely on heuristic-based reasoning, leading to errors in CFG reconstruction, IR decompilation, and execution reasoning. The study underscores the necessity for IR-specific enhancements in LLMs, recommending fine-tuning on structured IR datasets and integration of explicit control flow models to augment their comprehension and handling of IR-related tasks.
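To make the CFG reconstruction task concrete, here is a minimal, hypothetical sketch (not from the paper) of what a deterministic tool does and an LLM is asked to reproduce: given an LLVM-IR-like function, recover which basic blocks each block can branch to. The toy IR snippet, the regex-based parsing, and the `build_cfg` helper are all illustrative assumptions.

```python
import re

# Toy LLVM-IR-like function (illustrative, not from the paper's dataset):
# "entry" branches on %cond to "loop" or "exit"; "loop" branches back to
# itself or to "exit"; "exit" returns.
IR = """
entry:
  br i1 %cond, label %loop, label %exit
loop:
  br i1 %again, label %loop, label %exit
exit:
  ret void
"""

def build_cfg(ir_text):
    """Map each basic-block label to the labels it can branch to."""
    cfg = {}
    current = None
    for line in ir_text.splitlines():
        line = line.strip()
        if not line:
            continue
        label = re.match(r"^(\w+):", line)
        if label:
            # A new basic block starts at "name:".
            current = label.group(1)
            cfg[current] = []
        elif current is not None:
            # Successors are the "label %name" operands of br instructions.
            cfg[current].extend(re.findall(r"label %(\w+)", line))
    return cfg

print(build_cfg(IR))
# {'entry': ['loop', 'exit'], 'loop': ['loop', 'exit'], 'exit': []}
```

The branching pattern here (a conditional back-edge forming a loop) is exactly the kind of structure the paper reports LLMs misreading: a model that ignores the `loop -> loop` edge has silently dropped the loop.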
Problem

Research questions and friction points this paper is trying to address.

How well do LLMs actually understand Intermediate Representations?
Why do LLMs struggle with control-flow and execution reasoning over IR?
What IR-specific enhancements do LLMs need?
Innovation

Methods, ideas, or system contributions that make the work stand out.

First empirical benchmark of LLMs on IR-understanding tasks
Fine-tuning on structured IR datasets
Integration of explicit control-flow models
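One plausible reading of "integration of explicit control flow models" is to hand the model a precomputed CFG rather than asking it to infer branching from raw IR. The sketch below is a hypothetical illustration of that idea, assuming the CFG is available as an adjacency mapping (block label to successor labels); the `cfg_to_prompt` helper and the edge format are my own assumptions, not the paper's method.

```python
def cfg_to_prompt(cfg):
    """Render a CFG adjacency mapping as plain-text edges to prepend to an LLM prompt."""
    lines = ["Control-flow edges:"]
    for block, succs in cfg.items():
        if not succs:
            # No successors: the block leaves the function.
            lines.append(f"  {block} -> (function exit)")
        for succ in succs:
            lines.append(f"  {block} -> {succ}")
    return "\n".join(lines)

# Example: the same loop-shaped CFG discussed in the abstract.
print(cfg_to_prompt({"entry": ["loop", "exit"],
                     "loop": ["loop", "exit"],
                     "exit": []}))
```

Prepending such an edge list to an execution-reasoning prompt spells out the back-edge (`loop -> loop`) explicitly, so the model no longer has to recover loop structure from branch instructions on its own.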