🤖 AI Summary
To address the scarcity of high-quality instruction data and the prohibitive cost of manual construction in code generation, this paper proposes a tri-model co-evolutionary synthesis framework: an Instructor-LLM generates instructions, a Coder-LLM produces corresponding code, and a Judge-LLM automatically evaluates correctness; genetic operators—mutation, crossover, and selection—are applied to instructions. This work introduces the first Instructor-Coder-Judge co-evolution paradigm, enabling cold-start training with weak models and offering strong scalability and parallelism. Starting from only a small set of seed instructions, the framework efficiently synthesizes millions of high-quality instruction-code pairs. Experiments yield over 7.5 million samples; fine-tuning LLMs on this data significantly improves code generation performance, outperforming existing synthetic approaches and public datasets on benchmarks including HumanEval.
📝 Abstract
Large Language Models (LLMs) require high-quality instruction data for effective alignment, particularly in code generation tasks where expert-curated datasets are expensive to produce. We present Genetic-Instruct, a scalable algorithm for synthesizing large-scale, high-quality coding instructions using evolutionary principles. Starting from a small set of seed instructions, Genetic-Instruct generates diverse and challenging instruction-code pairs by leveraging an Instructor-LLM for instruction generation, a Coder-LLM for code synthesis, and a Judge-LLM for automatic quality evaluation. Our approach is highly parallelizable and remains effective even with small seed data and weaker generator models. We generated more than 7.5 million coding instructions with the proposed approach. We then evaluated the data by fine-tuning LLMs on the synthetic samples, demonstrating a significant improvement in their code generation capability compared to other synthetic generation approaches and publicly available datasets. Our results highlight the efficiency, scalability, and generalizability of the Genetic-Instruct framework.
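The Instructor-Coder-Judge loop described above can be sketched as a minimal evolutionary pipeline. The sketch below is an illustration only: the three `*_llm` functions are hypothetical stand-ins for real model calls (the paper does not publish this code), and the operator choices, population size, and acceptance logic are assumptions made for clarity.

```python
import random

# Hypothetical stubs for the three LLM roles; a real pipeline would call
# actual model endpoints here. The stubs are deterministic so the control
# flow can run end to end.

def instructor_llm(seed: str, operator: str) -> str:
    """Evolve an instruction with a genetic operator (mutation or crossover)."""
    return f"[{operator}] {seed}"

def coder_llm(instruction: str) -> str:
    """Generate candidate code for an instruction."""
    return f"def solve():\n    # solution for: {instruction}\n    return 42"

def judge_llm(instruction: str, code: str) -> bool:
    """Judge whether the code plausibly answers the instruction (stubbed)."""
    return "solution for" in code

def genetic_instruct(seeds, generations=2, population=4):
    """One possible sketch of the evolutionary synthesis loop:
    mutate/cross instructions, generate code, keep Judge-approved pairs."""
    pool = list(seeds)
    accepted = []
    for _ in range(generations):
        # Instructor-LLM applies genetic operators to the instruction pool.
        candidates = []
        for _ in range(population):
            if len(pool) >= 2 and random.random() < 0.5:
                a, b = random.sample(pool, 2)
                candidates.append(instructor_llm(f"{a} + {b}", "crossover"))
            else:
                candidates.append(instructor_llm(random.choice(pool), "mutation"))
        # Coder-LLM writes code; Judge-LLM filters (the selection step).
        for instr in candidates:
            code = coder_llm(instr)
            if judge_llm(instr, code):
                accepted.append((instr, code))
                pool.append(instr)  # surviving instructions rejoin the pool
    return accepted

pairs = genetic_instruct(["Reverse a string", "Sum a list of numbers"])
```

Because generation is independent per candidate, this loop parallelizes naturally across workers, which matches the scalability claim in the abstract.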