Model-Agnostic Correctness Assessment for LLM-Generated Code via Dynamic Internal Representation Selection

📅 2025-10-03
🤖 AI Summary
Existing LLM code correctness evaluation methods rely on fixed representations from predefined layers or positions, exhibiting poor generalizability. This paper proposes AUTOPROBE: the first model-agnostic, white-box evaluation framework that dynamically selects critical internal representations. AUTOPROBE leverages attention mechanisms to compute importance scores for hidden states across all layers and token positions, followed by weighted aggregation; a lightweight probe classifier then jointly predicts code compilability, functionality, and security. Extensive experiments across multiple LLMs and benchmarks demonstrate that AUTOPROBE outperforms state-of-the-art white-box methods by 18% on security assessment, and achieves improvements of 19% and 111% on compilability and functionality evaluation, respectively. Moreover, it significantly enhances cross-architectural and cross-task robustness and adaptability, establishing a new foundation for interpretable, generalizable code correctness assessment.
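The core mechanism described above can be illustrated with a minimal sketch: a learned query scores every hidden state across all layers and token positions, a softmax turns those scores into importance weights, the weighted states are aggregated into one vector, and a lightweight probe maps that vector to the three correctness dimensions. This is not the paper's implementation; all shapes, the random toy "hidden states", and the linear probe here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical shapes: L layers, T token positions, d hidden size
# (stand-ins for a real model's hidden states).
L, T, d = 4, 6, 8
hidden_states = rng.normal(size=(L, T, d))

# A learnable query vector scores every (layer, position) hidden state.
query = rng.normal(size=d)
flat = hidden_states.reshape(L * T, d)
scores = flat @ query                          # (L*T,) raw importance scores
weights = np.exp(scores - scores.max())
weights /= weights.sum()                       # softmax over all hidden states

# Weighted aggregation into a single representation.
pooled = weights @ flat                        # (d,)

# Lightweight probe: one linear head per correctness dimension
# (compilability, functionality, security), with sigmoid outputs.
W = rng.normal(size=(3, d))
b = np.zeros(3)
probs = 1.0 / (1.0 + np.exp(-(W @ pooled + b)))
```

In a real setting, `query`, `W`, and `b` would be trained jointly on labeled generations, so the attention weights learn which layers and positions carry the correctness signal rather than being fixed in advance.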

📝 Abstract
Large Language Models (LLMs) have demonstrated impressive capabilities in code generation and are increasingly integrated into the software development process. However, ensuring the correctness of LLM-generated code remains a critical concern. Prior work has shown that the internal representations of LLMs encode meaningful signals for assessing code correctness. Nevertheless, existing methods rely on representations from pre-selected, fixed layers and token positions, which could limit their generalizability across diverse model architectures and tasks. In this work, we introduce AUTOPROBE, a novel model-agnostic approach that dynamically selects the most informative internal representations for code correctness assessment. AUTOPROBE employs an attention-based mechanism to learn importance scores for hidden states, enabling it to focus on the most relevant features. These weighted representations are then aggregated and passed to a probing classifier to predict code correctness across multiple dimensions, including compilability, functionality, and security. To evaluate the performance of AUTOPROBE, we conduct extensive experiments across multiple benchmarks and code LLMs. Our experimental results show that AUTOPROBE consistently outperforms the baselines. For security assessment, AUTOPROBE surpasses the state-of-the-art white-box approach by 18%. For compilability and functionality assessment, AUTOPROBE demonstrates the highest robustness to code complexity, with performance higher than the other approaches by up to 19% and 111%, respectively. These findings highlight that dynamically selecting important internal signals enables AUTOPROBE to serve as a robust and generalizable solution for assessing the correctness of code generated by various LLMs.
Problem

Research questions and friction points this paper is trying to address.

Dynamically selecting informative internal representations for code correctness
Assessing compilability, functionality, and security of LLM-generated code
Providing model-agnostic correctness evaluation across diverse architectures
Innovation

Methods, ideas, or system contributions that make the work stand out.

Dynamic selection of informative internal representations for code assessment
Attention-based mechanism learns importance scores for hidden states
Probing classifier predicts correctness across multiple code dimensions
Thanh Trong Vu
VNU
Software Engineering, Automated Software Engineering

Tuan-Dung Bui
Faculty of Information Technology, VNU University of Engineering and Technology, Hanoi, Vietnam

Thu-Trang Nguyen
VNU University of Engineering and Technology
Automated Software Engineering, Program Analysis, Code Generation, AI

Son Nguyen
Faculty of Information Technology, VNU University of Engineering and Technology, Hanoi, Vietnam

Hieu Dinh Vo
VNU
Software Architecture, Program Analysis