"ENERGY STAR"LLM-Enabled Software Engineering Tools

📅 2026-01-27
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study addresses the significant energy consumption introduced by AI-augmented software engineering tools that enable large language models (LLMs) by default, a setting that can degrade energy efficiency throughout the development lifecycle. To mitigate this, we present the first systematic evaluation and optimization of energy efficiency in LLM-driven tools, proposing a synergistic approach that integrates retrieval-augmented generation (RAG) with prompt engineering techniques (PETs) to reduce energy usage while preserving code generation quality. We develop an evaluation framework capable of real-time monitoring of inference time and energy consumption, covering models from 125M to 7B parameters, including GPT-2, CodeLlama, Qwen 2.5, and DeepSeek Coder. Experimental results validate the effectiveness of our method, offering a reproducible benchmark and an empirical foundation for green AI in software engineering.
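The summary mentions real-time monitoring of inference time and energy consumption per generation. The paper's actual framework is not shown here; as a minimal sketch, one could wrap any generation callable with a timer and an optional energy counter (the `read_energy_joules` callback is a hypothetical stand-in for a cumulative hardware counter such as RAPL, not part of the paper's tooling):

```python
import time

def measure_generation(generate, prompt, read_energy_joules=None):
    """Time one LLM generation call and, if an energy reader is supplied,
    estimate the energy consumed during it.

    `generate` is any callable prompt -> text. `read_energy_joules` is a
    hypothetical zero-argument callable returning a cumulative energy
    counter in joules (e.g. backed by RAPL); both names are illustrative
    assumptions, not the paper's actual API.
    """
    e0 = read_energy_joules() if read_energy_joules else None
    t0 = time.perf_counter()
    output = generate(prompt)
    elapsed = time.perf_counter() - t0
    # Energy is the counter delta across the call, when a reader exists.
    energy = (read_energy_joules() - e0) if e0 is not None else None
    return {"output": output, "seconds": elapsed, "joules": energy}
```

Reporting both seconds and joules per request makes it possible to compare models of different sizes (125M to 7B parameters) on a common energy-per-generation basis.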

📝 Abstract
The discussion around AI-Engineering, that is, Software Engineering (SE) for AI-enabled systems, cannot ignore a crucial class of software systems that are increasingly becoming AI-enhanced: those used to enable or support the SE process, such as Computer-Aided SE (CASE) tools and Integrated Development Environments (IDEs). In this paper, we study the energy efficiency of these systems. As AI becomes seamlessly available in these tools and, in many cases, is active by default, we are entering a new era with significant implications for energy consumption patterns throughout the Software Development Lifecycle (SDLC). We focus on advanced Machine Learning (ML) capabilities provided by Large Language Models (LLMs). Our proposed approach combines Retrieval-Augmented Generation (RAG) with Prompt Engineering Techniques (PETs) to enhance both the quality and energy efficiency of LLM-based code generation. We present a comprehensive framework that measures real-time energy consumption and inference time across diverse model architectures ranging from 125M to 7B parameters, including GPT-2, CodeLlama, Qwen 2.5, and DeepSeek Coder. These LLMs, chosen for practical reasons, are sufficient to validate the core ideas and provide a proof of concept for more in-depth future analysis.
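The abstract's central idea is combining RAG with prompt engineering so that only relevant context reaches the model, which shortens prompts and so reduces inference cost. As an illustrative sketch only (the paper's retriever and templates are not described here; the word-overlap ranking and the template below are assumptions), the combination can be reduced to a prompt builder:

```python
def build_prompt(task, corpus, k=2):
    """Minimal RAG + prompt-engineering sketch: rank corpus snippets by
    word overlap with the task, then embed only the top-k in a concise
    instruction template, so the model processes fewer tokens (and thus
    less energy) than with the full corpus. Illustrative, not the
    paper's pipeline, which may use learned embeddings and other PETs.
    """
    task_words = set(task.lower().split())
    # Crude lexical retrieval: snippets sharing more words rank higher.
    ranked = sorted(corpus, key=lambda s: -len(task_words & set(s.lower().split())))
    context = "\n".join(ranked[:k])
    return (
        "You are a code assistant. Using only the context below, "
        f"write concise code for the task.\n\nContext:\n{context}\n\nTask: {task}\nCode:"
    )
```

A production system would swap the lexical overlap for embedding-based retrieval, but the energy argument is the same: a shorter, better-targeted prompt cuts per-request inference time on every model in the 125M to 7B range.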
Problem

Research questions and friction points this paper is trying to address.

Energy Efficiency
Large Language Models
Software Engineering Tools
AI-Engineering
SDLC
Innovation

Methods, ideas, or system contributions that make the work stand out.

Retrieval-Augmented Generation
Prompt Engineering
Energy Efficiency
Large Language Models
Code Generation