Condor: A Code Discriminator Integrating General Semantics with Code Details

📅 2024-12-23
🏛️ arXiv.org
📈 Citations: 2
✨ Influential: 0
📄 PDF
🤖 AI Summary
To address the challenge of accurately identifying correct implementations among multiple code candidates generated by large language models (LLMs), this paper proposes a fine-grained, highly flexible non-execution-based code discriminator. Methodologically, it introduces a novel integration of contrastive learning with intermediate-state modeling of code edits, constructing semantically progressive code pairs to enrich training data and designing a Transformer-based discriminator to enable fine-grained representation alignment. This framework overcomes the limitation of conventional discriminators in capturing subtle semantic distinctions. Experiments demonstrate significant improvements: +6.0 percentage points in F1 score on CodeNanoFix; Pass@1 of 62.63% (+10.0 points) on CodeNanoFix using Meta-Llama-3.1-Instruct (70B); and a 147.05% relative gain in Pass@1 on APPS, validating its strong generalization capability and practical utility.

๐Ÿ“ Abstract
LLMs demonstrate significant potential across various software engineering tasks. However, they still face challenges in generating correct code on the first attempt when addressing complex requirements. Introducing a discriminator to select reliable outputs from multiple generated results is an effective way to enhance their reliability and stability. Currently, these discriminators fall into two categories: execution-based discriminators and non-execution-based discriminators. Execution-based discriminators face flexibility challenges due to difficulties in obtaining test cases and security concerns, while non-execution-based discriminators, although more flexible, struggle to capture subtle differences in code details. To maintain flexibility while improving the model's ability to capture fine-grained code details, this paper proposes Condor. We first design contrastive learning to optimize the code representations of the base model, enabling it to reflect differences in code details. Then, we leverage intermediate data from the code modification process to further enrich the discriminator's training data, enhancing its ability to discern code details. Experimental results indicate that on the subtle code difference dataset (i.e., CodeNanoFix), Condor significantly outperforms other discriminators in discriminative performance: Condor (1.3B) improves the discriminative F1 score of DeepSeek-Coder (1.3B) from 67% to 73%. In discriminating LLM-generated outputs, Condor (1.3B) and Condor (110M) raise the Pass@1 score of Meta-Llama-3.1-Instruct (70B) on the CodeNanoFix dataset from 52.64% to 62.63% and 59.64%, respectively. Moreover, Condor demonstrates strong generalization capabilities on the MBPP and APPS datasets. For example, Condor (1.3B) improves the Pass@1 of Meta-Llama-3.1-Instruct (70B) on the APPS dataset by 147.05%.
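The selection step described in the abstract (picking one reliable output from several generated candidates) can be sketched as best-of-N reranking. This is a minimal, hypothetical illustration: Condor's real discriminator is a fine-tuned Transformer that scores problem/code pairs, whereas the `score` stub below is only a token-overlap placeholder.

```python
# Minimal sketch of discriminator-based reranking (best-of-N selection).
# The `score` function is a hypothetical stand-in: a real discriminator
# would embed the (problem, code) pair and output a correctness estimate.

def score(problem: str, code: str) -> float:
    # Placeholder heuristic: count shared whitespace-separated tokens.
    return float(len(set(problem.split()) & set(code.split())))

def select_best(problem: str, candidates: list[str]) -> str:
    # Rank the LLM-generated candidates and keep the highest-scoring one;
    # Pass@1 is then measured on this single selected program.
    return max(candidates, key=lambda c: score(problem, c))
```

In this setup, Pass@1 is computed on the single program the discriminator ranks highest, which is how gains over direct sampling are measured.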
Problem

Research questions and friction points this paper is trying to address.

How to reliably identify correct implementations among multiple LLM-generated code candidates
How to keep a discriminator flexible (non-execution-based) while still capturing fine-grained code details
How to obtain training data rich enough to teach subtle semantic distinctions between similar programs
Innovation

Methods, ideas, or system contributions that make the work stand out.

Contrastive learning optimizes the base model's code representations so that they reflect fine-grained detail differences.
Intermediate states from the code modification process enrich the discriminator's training data.
The resulting Condor discriminator significantly improves the accuracy of code selected from LLM outputs.
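The contrastive-learning idea above, pulling representations of matching code together and pushing differing code apart, can be illustrated with a generic InfoNCE-style loss. This is a sketch under the assumption that code embeddings are already computed; it is not Condor's exact training objective.

```python
import math

def cosine(u, v):
    # Cosine similarity between two embedding vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def info_nce(anchor, positive, negatives, tau=0.1):
    # Generic contrastive objective: the loss is low when the anchor is
    # closer to the positive (matching) code than to any negative
    # (semantically differing) code, and high otherwise.
    sims = [cosine(anchor, positive)] + [cosine(anchor, n) for n in negatives]
    logits = [s / tau for s in sims]
    m = max(logits)  # shift for numerical stability
    exps = [math.exp(l - m) for l in logits]
    return -math.log(exps[0] / sum(exps))
```

During training, pairs drawn from the code modification process (e.g. successive intermediate states of an edit) would supply the anchor, positive, and negative embeddings.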
Qingyuan Liang
Peking University
Software Engineering, Code Generation
Zhao Zhang
Key Lab of HCST (PKU), MOE; SCS, Peking University, China
Chen Liu
Key Lab of HCST (PKU), MOE; SCS, Peking University, China
Zeyu Sun
National Key Laboratory of Space Integrated Information System, Institute of Software, Chinese Academy of Sciences, China
Wenjie Zhang
National University of Singapore, Singapore
Yizhou Chen
Peking University
AI4SE, Vulnerability Detection, Formal Verification
Zixiao Zhao
Key Lab of HCST (PKU), MOE; SCS, Peking University, China
Qi Luo
Southern University of Science and Technology, China
Wentao Wang
Key Lab of HCST (PKU), MOE; SCS, Peking University, China
Yanjie Jiang
Tianjin University
Software Refactoring and Testing
Yingfei Xiong
Associate Professor, Peking University
Software Engineering, Programming Languages, Program Repair, Program Synthesis, Program Analysis
Lu Zhang
Key Lab of HCST (PKU), MOE; SCS, Peking University, China