CodeChemist: Functional Knowledge Transfer for Low-Resource Code Generation via Test-Time Scaling

📅 2025-10-01
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the poor code-generation performance of low-resource programming languages caused by insufficient training data, this paper proposes CodeChemist, a novel framework that combines test-time scaling with cross-lingual functional knowledge transfer and requires no fine-tuning. Specifically, CodeChemist automatically generates and executes test cases in a high-resource language (e.g., Python), employs multi-temperature hedged sampling to produce diverse candidate programs in the low-resource language, and filters and re-ranks those candidates by their test pass rates to transfer functional semantics precisely. Experimental results show that CodeChemist significantly outperforms existing test-time scaling methods across multiple low-resource languages, including Rust, Go, and JavaScript, achieving an average 12.7% improvement in pass@1 on the HumanEval-X benchmark and validating both its effectiveness and its generalization across programming languages.

📝 Abstract
Code Large Language Models (CodeLLMs) are increasingly used in code generation tasks across a wide range of applications. However, their performance is often inconsistent across different programming languages (PLs), with low-resource PLs suffering the most due to limited training data. In this paper, we present CodeChemist, a novel and efficient framework for test-time scaling that enables functional knowledge transfer from high-resource to low-resource PLs using generated test cases. CodeChemist first generates and executes code in high-resource PLs to create test cases that encapsulate functional knowledge. It then uses multi-temperature hedged sampling to generate code snippets in the low-resource PL and selects the best one based on the pass rate of the test cases. Our extensive experiments show that CodeChemist outperforms existing test-time scaling approaches, boosting the performance of code generation for low-resource PLs without requiring any model retraining.
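The abstract's final step, selecting the candidate with the highest test pass rate, can be sketched as below. This is a toy illustration, not the paper's implementation: candidates are plain Python callables and test cases are (arguments, expected-output) pairs, whereas CodeChemist executes generated code in the target language against tests transferred from a high-resource one.

```python
# Toy sketch of pass-rate-based candidate selection (assumption: candidates
# are Python callables; the real system runs low-resource-PL code snippets).

def pass_rate(candidate, test_cases):
    """Fraction of test cases the candidate passes; crashes count as failures."""
    passed = 0
    for args, expected in test_cases:
        try:
            if candidate(*args) == expected:
                passed += 1
        except Exception:
            pass  # a crashing candidate simply fails that test
    return passed / len(test_cases)

def select_best(candidates, test_cases):
    """Rank sampled candidates by pass rate and return the top one."""
    return max(candidates, key=lambda c: pass_rate(c, test_cases))

# Demo: three candidate implementations of absolute value.
tests = [((3,), 3), ((-3,), 3), ((0,), 0)]
candidates = [
    lambda x: x,                   # wrong on negatives
    lambda x: -x if x < 0 else x,  # correct
    lambda x: x * x,               # wrong except at 0 and 1
]
best = select_best(candidates, tests)  # picks the correct implementation
```

Crashing candidates are treated the same as failing ones, so a snippet that does not even run can never outrank one that passes some tests.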
Problem

Research questions and friction points this paper is trying to address.

Transferring functional knowledge from high-resource to low-resource programming languages
Improving code generation consistency across different programming languages
Enhancing low-resource language performance without model retraining
Innovation

Methods, ideas, or system contributions that make the work stand out.

Transfers functional knowledge via test cases
Uses multi-temperature hedged sampling technique
Selects best code via test case pass rate
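The multi-temperature hedged sampling idea above can be illustrated with a short sketch: rather than drawing every candidate at a single temperature, the sampling budget is split across a ladder of temperatures, so the candidate pool mixes conservative and exploratory generations. Here `sample_code` is a hypothetical stand-in for a CodeLLM call, not an API from the paper.

```python
import random

def sample_code(prompt, temperature, rng):
    # Placeholder for a CodeLLM call: higher temperature stands in for
    # more varied output (simulated here with random noise).
    noise = rng.random() * temperature
    return f"// candidate for {prompt!r} (t={temperature}, noise={noise:.2f})"

def multi_temperature_sampling(prompt, temperatures, budget, seed=0):
    """Split `budget` samples evenly across the temperature ladder."""
    rng = random.Random(seed)
    per_temp = budget // len(temperatures)
    pool = []
    for t in temperatures:
        for _ in range(per_temp):
            pool.append(sample_code(prompt, t, rng))
    return pool

pool = multi_temperature_sampling("abs in Rust", [0.2, 0.6, 1.0], budget=9)
```

The hedge is that no single temperature has to be right: low temperatures favor the model's most likely (often most syntactically safe) program, while high temperatures add the diversity the pass-rate filter needs to find a functionally correct outlier.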
Kaixin Wang
Xi'an Jiaotong University
Tianlin Li
Nanyang Technological University
AI4SE · SE4AI · Trustworthy AI
Xiaoyu Zhang
Nanyang Technological University
Aishan Liu
Beihang University
Xianglong Liu
Beihang University
Ziqi Liu
Ant Group
Zhiqiang Zhang
Ant Group
Jun Zhou
Ant Group
Bin Shi
Xi'an Jiaotong University
Virtualization · Data Mining