BanglaForge: LLM Collaboration with Self-Refinement for Bangla Code Generation

📅 2025-12-22
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
To address data scarcity and limited tooling for Bangla, a low-resource language for code generation, this paper proposes BanglaForge, a retrieval-augmented, dual-model (coder-reviewer) collaborative framework. It integrates in-context learning, LLM-assisted translation, systematic prompt engineering, and a multi-round self-refinement mechanism driven by execution feedback. A coder model generates an initial solution and a reviewer model revises it for robustness, with iterative corrections guided by program execution feedback. Evaluated on the BLP-2025 Bangla Code Generation benchmark, the approach achieves a competitive 84.00% Pass@1 accuracy. The work applies retrieval augmentation and execution-aware self-refinement to a low-resource NL2Code task, offering a paradigm that may extend to code generation in other resource-constrained languages.
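
For readers unfamiliar with the metric, Pass@1 with one generated solution per problem is the percentage of problems whose single solution passes all of its tests. The sketch below is illustrative only: the function name `pass_at_1` and the sample outcomes are assumptions, not the benchmark's official scorer or data.

```python
from typing import List


def pass_at_1(results: List[bool]) -> float:
    """Percentage of problems solved on the first (and only) attempt.

    `results` holds one boolean per problem: True if the single generated
    solution passed every test for that problem.
    """
    if not results:
        raise ValueError("results must be non-empty")
    return 100.0 * sum(results) / len(results)


# Hypothetical outcomes for four problems (not BLP-2025 data).
example = [True, True, True, False]
print(f"{pass_at_1(example):.2f}%")  # prints "75.00%"
```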

📝 Abstract
Bangla is a low-resource language for code generation, lacking large-scale annotated datasets and tools to transform natural language specifications into executable programs. This makes Bangla-to-code generation a challenging task requiring innovative solutions. To address this, we introduce BanglaForge, a novel framework for generating code from Bangla function descriptions. BanglaForge leverages a retrieval-augmented dual-model collaboration paradigm with self-refinement, combining in-context learning, LLM-based translation, systematic prompt engineering, and iterative self-refinement based on execution feedback. In this setup, a coder generates initial solutions and a reviewer enhances them for robustness. On the BLP-2025 Bangla Code Generation benchmark, BanglaForge achieves a competitive Pass@1 accuracy of 84.00%, demonstrating the effectiveness of retrieval, model collaboration, and self-refinement for low-resource Bangla code generation.
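
The coder-reviewer loop with execution feedback can be pictured roughly as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not specify how the two models exchange feedback, so `run_tests`, `generate_with_refinement`, `max_rounds`, and the toy coder and reviewer functions are all hypothetical stand-ins for real LLM calls.

```python
from typing import Callable, List, Optional, Tuple


def run_tests(code: str, tests: List[str]) -> Optional[str]:
    """Execute candidate code and assert-style tests in one namespace.

    Returns None if everything passes, otherwise an error string that
    serves as execution feedback. exec() on model output is unsafe
    outside a sandbox; it is used here only to keep the sketch short.
    """
    namespace: dict = {}
    try:
        exec(code, namespace)
        for test in tests:
            exec(test, namespace)
    except Exception as exc:  # any failure becomes feedback text
        return f"{type(exc).__name__}: {exc}"
    return None


def generate_with_refinement(
    task: str,
    tests: List[str],
    coder: Callable[[str], str],
    reviewer: Callable[[str, str, str], str],
    max_rounds: int = 3,  # assumed budget, not taken from the paper
) -> Tuple[str, bool]:
    """The coder drafts once; the reviewer revises after each failed run."""
    code = coder(task)
    for _ in range(max_rounds):
        feedback = run_tests(code, tests)
        if feedback is None:
            return code, True
        code = reviewer(task, code, feedback)
    return code, run_tests(code, tests) is None


# Toy stand-ins for the two models (hypothetical).
def toy_coder(task: str) -> str:
    return "def add(a, b):\n    return a - b\n"  # deliberately buggy draft


def toy_reviewer(task: str, code: str, feedback: str) -> str:
    return code.replace("a - b", "a + b")  # pretend the reviewer fixed it


# Bangla task description, roughly "return the sum of two numbers".
final_code, passed = generate_with_refinement(
    "দুটি সংখ্যার যোগফল ফেরত দাও",
    ["assert add(2, 3) == 5"],
    toy_coder,
    toy_reviewer,
)
print(passed)
```

In a real system the two callables would wrap LLM requests, and the tests would run in an isolated process with a timeout.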
Problem

Research questions and friction points this paper is trying to address.

Generates code from Bangla natural language descriptions
Addresses low-resource challenges in Bangla code generation
Uses collaborative LLMs with self-refinement for robustness
Innovation

Methods, ideas, or system contributions that make the work stand out.

Retrieval-augmented dual-model collaboration paradigm
Self-refinement with execution feedback iteration
In-context learning and systematic prompt engineering
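
Retrieval-augmented in-context learning can be sketched as picking the solved examples most similar to the new task and placing them ahead of it in the prompt. The snippet below is an illustration under assumptions: token-overlap (Jaccard) similarity is a simple stand-in, since the summary does not say which retriever is used, and the example pool, prompt layout, and function names are hypothetical. The instructions are shown in English, as they might look after a translation step.

```python
from typing import Dict, List


def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity between two instructions (stand-in retriever)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def retrieve(query: str, pool: List[Dict[str, str]], k: int = 2) -> List[Dict[str, str]]:
    """Return the k pool entries whose instruction is most similar to the query."""
    ranked = sorted(pool, key=lambda ex: jaccard(query, ex["instruction"]), reverse=True)
    return ranked[:k]


def build_prompt(query: str, examples: List[Dict[str, str]]) -> str:
    """Place retrieved examples before the new task as few-shot demonstrations."""
    parts = ["Solve the final task. Worked examples come first."]
    for ex in examples:
        parts.append(f"### Task\n{ex['instruction']}\n### Solution\n{ex['code']}")
    parts.append(f"### Task\n{query}\n### Solution\n")
    return "\n\n".join(parts)


# Hypothetical pool of solved tasks.
pool = [
    {"instruction": "return the sum of a list of numbers",
     "code": "def total(xs):\n    return sum(xs)"},
    {"instruction": "reverse a string",
     "code": "def rev(s):\n    return s[::-1]"},
    {"instruction": "return the largest number in a list",
     "code": "def largest(xs):\n    return max(xs)"},
]

query = "return the smallest number in a list"
top = retrieve(query, pool, k=2)
prompt = build_prompt(query, top)
print(prompt)
```

A production retriever would more likely use dense embeddings, but the prompt assembly step would look much the same.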
Mahir Labib Dihan
CSE, BUET
Natural Language Processing, Large Language Models, Geo Spatial
Sadif Ahmed
Department of Computer Science and Engineering, Bangladesh University of Engineering and Technology (BUET), Dhaka, Bangladesh
Md Nafiu Rahman
Department of Computer Science and Engineering, Bangladesh University of Engineering and Technology (BUET), Dhaka, Bangladesh