🤖 AI Summary
Existing multi-agent code generation approaches face two key bottlenecks: they either rely on large foundation models (>30B parameters) or fail to function effectively on small open-source language models (e.g., 7B). This paper proposes a lightweight three-stage framework that, for the first time, enables stable, role-separated multi-agent collaborative programming within a single 7B language model. First, trajectory distillation from strong models mitigates format fragility; second, supervised error correction strengthens planning and coding capabilities; third, LoRA-based agent-level fine-tuning specializes four distinct roles (retrieval, planning, coding, and debugging) with <3% additional parameters. On xCodeEval, our method improves accuracy from 13.2% to 28.3%, eliminates format errors entirely, and approaches the performance of a 32B baseline while reducing GPU memory consumption and inference latency by 4×.
📝 Abstract
Large language models (LLMs) have advanced code generation from single-function tasks to competitive-programming problems, but existing multi-agent solutions either rely on costly large-scale (>30B) models or collapse when downsized to small open-source models. We present MapCoder-Lite, which upgrades a single 7B model into four role-specialised agents (retriever, planner, coder, and debugger) using only rank-32, role-specific LoRA adapters (<3% extra parameters). Three lightweight techniques make this possible: (i) trajectory distillation from strong LLMs fixes format fragility in retrieval and debugging, (ii) supervisor-guided correction strengthens the planning and coding agents, and (iii) agent-wise LoRA fine-tuning delivers memory-efficient specialisation. Comprehensive evaluation on xCodeEval, APPS, and CodeContests shows that MapCoder-Lite more than doubles xCodeEval accuracy (from 13.2% to 28.3%), eliminates all format failures, and closes to within six points of a 32B baseline while cutting GPU memory and token-generation time by 4×. These results demonstrate that careful agent-wise fine-tuning unleashes high-quality multi-agent coding on a small language model.
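The "<3% extra parameters" figure can be sanity-checked with back-of-envelope arithmetic. The sketch below assumes a typical 7B architecture (32 layers, hidden size 4096, LoRA applied to the four attention projections); these dimensions and the set of adapted modules are assumptions for illustration, not details taken from the paper.

```python
# Rough parameter-overhead estimate for four rank-32 LoRA role adapters
# on an assumed 7B transformer (dims below are typical, not from the paper).

HIDDEN = 4096        # assumed hidden dimension
LAYERS = 32          # assumed number of transformer layers
RANK = 32            # LoRA rank, as stated in the abstract
PROJECTIONS = 4      # assume q, k, v, o attention projections are adapted
ROLES = 4            # retriever, planner, coder, debugger
BASE_PARAMS = 7e9    # base model size

def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """A rank-r LoRA adapter adds two low-rank factors: (d_in x r) and (r x d_out)."""
    return rank * (d_in + d_out)

per_role = LAYERS * PROJECTIONS * lora_params(HIDDEN, HIDDEN, RANK)
total = ROLES * per_role
print(f"per-role adapter: {per_role / 1e6:.1f}M params")
print(f"all four roles:   {total / 1e6:.1f}M params "
      f"({100 * total / BASE_PARAMS:.2f}% of base)")
# -> roughly 33.6M per role, ~134M total, about 1.9% of the 7B base,
#    comfortably under the 3% budget the abstract claims.
```

Under these assumptions the four adapters together add about 1.9% of the base parameter count, consistent with the paper's <3% figure; a larger set of target modules (e.g., MLP projections) would raise the overhead but could still fit the budget.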