Toward Green Code: Prompting Small Language Models for Energy-Efficient Code Generation

πŸ“… 2025-09-11
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ€– AI Summary
Large language models (LLMs) incur substantial energy consumption and carbon emissions, hindering sustainable software development. Method: This study investigates prompt engineering as a green optimization lever for small language models (SLMs) in code generation, evaluating four open-source SLMs (Qwen2.5-Coder, StableCode-3B, CodeLlama-7B, and Phi-3-Mini-4K) on LeetCode Python tasks under role-based, zero-shot, few-shot, and chain-of-thought (CoT) prompting, and measuring runtime, memory footprint, and energy consumption for every generated solution. Contribution/Results: CoT prompting significantly reduces energy consumption for certain SLMs, with strong model specificity: Qwen2.5-Coder and StableCode-3B achieve consistent energy savings and surpass the human-written baseline, whereas CodeLlama-7B and Phi-3-Mini-4K fail to outperform it under any strategy. This work is the first to empirically demonstrate CoT as an effective, energy-efficient prompting paradigm for SLMs in code generation, establishing a reproducible, low-power pathway for AI-assisted programming.

πŸ“ Abstract
There is a growing concern about the environmental impact of large language models (LLMs) in software development, particularly due to their high energy use and carbon footprint. Small Language Models (SLMs) offer a more sustainable alternative, requiring fewer computational resources while remaining effective for fundamental programming tasks. In this study, we investigate whether prompt engineering can improve the energy efficiency of SLMs in code generation. We evaluate four open-source SLMs, StableCode-Instruct-3B, Qwen2.5-Coder-3B-Instruct, CodeLlama-7B-Instruct, and Phi-3-Mini-4K-Instruct, across 150 Python problems from LeetCode, evenly distributed into easy, medium, and hard categories. Each model is tested under four prompting strategies: role prompting, zero-shot, few-shot, and chain-of-thought (CoT). For every generated solution, we measure runtime, memory usage, and energy consumption, comparing the results with a human-written baseline. Our findings show that CoT prompting provides consistent energy savings for Qwen2.5-Coder and StableCode-3B, while CodeLlama-7B and Phi-3-Mini-4K fail to outperform the baseline under any prompting strategy. These results highlight that the benefits of prompting are model-dependent and that carefully designed prompts can guide SLMs toward greener software development.
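The abstract names four prompting strategies but not the exact prompt wording. The sketch below illustrates, under assumed templates, what role, zero-shot, few-shot, and CoT prompts for a LeetCode-style Python task could look like; the problem statement and all template text are hypothetical, not taken from the paper.

```python
# Illustrative templates for the four prompting strategies evaluated in the
# paper. The wording is an assumption; the paper does not publish its prompts.

PROBLEM = ("Given an array of integers and a target, return the indices of "
           "two numbers that add up to the target.")

role_prompt = (
    "You are an expert Python programmer who writes efficient, "
    "resource-conscious code.\n"
    f"Solve the following problem:\n{PROBLEM}"
)

zero_shot_prompt = f"Write a Python function that solves:\n{PROBLEM}"

few_shot_prompt = (
    "Example problem: Reverse a string.\n"
    "Example solution:\n"
    "def reverse(s):\n"
    "    return s[::-1]\n\n"
    f"Now solve:\n{PROBLEM}"
)

cot_prompt = (
    f"Solve the following problem:\n{PROBLEM}\n"
    "Think step by step: first outline your approach, then write the final "
    "Python function."
)

prompts = {
    "role": role_prompt,
    "zero-shot": zero_shot_prompt,
    "few-shot": few_shot_prompt,
    "cot": cot_prompt,
}
```

Each template would be sent to all four SLMs for the same problem, so that any runtime, memory, or energy differences can be attributed to the prompting strategy rather than the task.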
Problem

Research questions and friction points this paper is trying to address.

Improving energy efficiency of small language models in code generation
Evaluating prompt engineering strategies for sustainable software development
Assessing model-specific energy savings with chain-of-thought prompting
Innovation

Methods, ideas, or system contributions that make the work stand out.

Using prompt engineering for energy efficiency
Evaluating four SLMs across multiple prompting strategies
Measuring runtime, memory, and energy consumption metrics
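The per-solution measurement described above can be sketched with standard tooling. The paper does not name its instrumentation, so the sketch below is an assumption: runtime via `time.perf_counter`, peak memory via `tracemalloc`, and energy via the Linux powercap (Intel RAPL) sysfs counter, which is only readable on supported hardware.

```python
# Hedged sketch of profiling one generated solution for runtime, peak memory,
# and energy. The RAPL sysfs path is a Linux-specific assumption; energy is
# reported as None when the counter is unavailable.
import time
import tracemalloc
from pathlib import Path

RAPL = Path("/sys/class/powercap/intel-rapl:0/energy_uj")

def read_energy_uj():
    """Return cumulative package energy in microjoules, or None if unreadable."""
    try:
        return int(RAPL.read_text())
    except (OSError, ValueError):
        return None

def profile(solution, *args):
    """Run one solution; report (result, runtime_s, peak_mem_bytes, energy_j)."""
    e0 = read_energy_uj()
    tracemalloc.start()
    t0 = time.perf_counter()
    result = solution(*args)
    runtime = time.perf_counter() - t0
    _, peak_mem = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    e1 = read_energy_uj()
    energy = (e1 - e0) / 1e6 if e0 is not None and e1 is not None else None
    return result, runtime, peak_mem, energy

# Usage: profile a candidate two-sum implementation, as a model might emit it.
def two_sum(nums, target):
    seen = {}
    for i, x in enumerate(nums):
        if target - x in seen:
            return [seen[target - x], i]
        seen[x] = i

result, runtime, peak_mem, energy = profile(two_sum, [2, 7, 11, 15], 9)
```

In practice the same harness would be applied to the human-written baseline solution, so each prompting strategy's output can be compared against it on identical inputs.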
Humza Ashraf
Algoma University, Brampton, Canada
Syed Muhammad Danish
Assistant Professor, Algoma University
Blockchain, Electric Vehicles, Security and Privacy, Energy Efficient Systems, Sustainable AI
Zeeshan Sattar
Ericsson Inc., Ottawa, Canada