🤖 AI Summary
Large pre-trained models incur substantial energy consumption and environmental impact, yet existing model compression research primarily prioritizes accuracy preservation without quantifying real-world electricity usage. This work establishes a direct empirical link between model compression techniques and measured power consumption. We systematically evaluate three classes of structural compression (pruning, low-rank decomposition, and steganographic capacity reduction) across nine pre-trained models (8M–138M parameters) under standardized training conditions. Results show that steganographic capacity reduction achieves an average 37% reduction in training energy consumption with less than 0.8% accuracy degradation, whereas conventional pruning and low-rank decomposition yield negligible energy savings. We introduce a reproducible, hardware-level power-monitoring experimental framework and uncover a nonlinear relationship between structural compression pathways and energy efficiency. This work provides a foundational methodology and empirical evidence for green AI, shifting the evaluation paradigm from accuracy alone to energy-aware model design.
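The hardware-level power monitoring described above amounts to sampling instantaneous power draw during training and integrating it over time. A minimal sketch of that accounting step is below; it assumes power samples in watts polled at a fixed interval (in practice the samples would come from a tool such as NVML or a wall-plug meter, which the summary does not specify), and the function name `energy_wh` is illustrative, not from the paper.

```python
# Sketch of energy accounting from sampled power readings (assumed setup,
# not the paper's exact framework): integrate power (W) over time to get
# energy, reported in watt-hours.

def energy_wh(power_samples_w, interval_s):
    """Trapezoidal integration of power samples taken every interval_s seconds."""
    if len(power_samples_w) < 2:
        return 0.0
    joules = sum(
        (a + b) / 2.0 * interval_s
        for a, b in zip(power_samples_w, power_samples_w[1:])
    )
    return joules / 3600.0  # 1 Wh = 3600 J

# Example: a constant 200 W draw sampled once per second for one hour
samples = [200.0] * 3601
print(energy_wh(samples, 1.0))  # → 200.0
```

Comparing this quantity between the uncompressed baseline run and each compressed run gives the energy-saving figures the summary reports.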
📝 Abstract
Increasingly complex neural network architectures have achieved phenomenal performance. However, these complex models require massive computational resources, consuming substantial amounts of electricity and raising concerns about their environmental impact. Previous studies have demonstrated that substantial redundancies exist in large pre-trained models, but this prior work has focused primarily on compressing models while retaining comparable performance; the direct impact of compression on electricity consumption has received relatively little attention. By quantifying the energy usage of both uncompressed and compressed models, we investigate compression as a means of reducing electricity consumption. We consider nine different pre-trained models, ranging in size from 8M to 138M parameters. To establish a baseline, we first train each model without compression and record the electricity usage and time required during training, along with other relevant statistics. We then apply three compression techniques: steganographic capacity reduction, pruning, and low-rank factorization. In each of the resulting cases, we again measure the electricity usage, training time, model accuracy, and other relevant statistics. We find that pruning and low-rank factorization offer no significant improvements with respect to energy usage or related statistics, while steganographic capacity reduction provides major benefits in almost every case. We discuss the significance of these findings.
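Of the three techniques the abstract names, low-rank factorization is the easiest to illustrate concretely: a dense weight matrix W of shape m×n is replaced by two factors of rank r, cutting the parameter count from m·n to r·(m+n). The sketch below uses truncated SVD via NumPy as one standard way to obtain such factors; it is an illustrative example under that assumption, not the paper's exact procedure.

```python
import numpy as np

def low_rank_factorize(W, r):
    """Factor W (m x n) into A (m x r) and B (r x n) via truncated SVD,
    so that A @ B approximates W with r*(m+n) parameters instead of m*n."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    return U[:, :r] * s[:r], Vt[:r, :]

rng = np.random.default_rng(0)
W = rng.standard_normal((512, 256))
A, B = low_rank_factorize(W, r=32)
print(A.shape, B.shape)          # (512, 32) (32, 256)
print(W.size, A.size + B.size)   # 131072 24576
```

Pruning, by contrast, zeroes individual weights or whole structures, and steganographic capacity reduction shrinks the model's capacity to hide extraneous information; the abstract's finding is that only the latter translated into meaningful electricity savings.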