🤖 AI Summary
Research on large language model (LLM)-based code generation has focused mainly on correctness and has largely overlooked code readability, a subjective property that is hard to optimize. To bridge this gap, we propose the first multitask representation engineering (RepE) framework that jointly enhances both readability and correctness with low data dependency and low computational cost. Theoretical analysis elucidates how multitask guidance influences the trade-off between these two objectives, overcoming the limitations of single-task control. Experimental results demonstrate that our approach significantly improves the readability of generated code without compromising high correctness rates. The implementation is open-source and available upon request.
📝 Abstract
Correctness and readability are key measures of code quality, ensuring functional fidelity and ease of comprehension, respectively. While most existing research focuses on improving the correctness of code generated by large language models (LLMs), readability remains under-addressed. Enhancing readability through targeted control is challenging because readability is subjective. In this article, we employ representation engineering (RepE) as the targeted control method because of its low data dependency and low computational cost. Prior work on RepE has primarily focused on targeted control for a single task, but improving code readability requires control across multiple tasks. Accordingly, we propose a multitask RepE framework and theoretically discuss the impact of the multitask steering method on the trade-off between code readability and correctness. We further provide comprehensive experiments in support of this analysis. All relevant implementations are open-source and available upon request.