Energy-Efficient Software-Hardware Co-Design for Machine Learning: From TinyML to Large Language Models

📅 2026-03-24
📈 Citations: 0
Influential: 0
📝 Abstract
The rapid deployment of machine learning across platforms, from milliwatt-class TinyML devices to large language models, has made energy efficiency a primary constraint for sustainable AI. Across these scales, performance and energy are increasingly limited by data movement and memory-system behavior rather than by arithmetic throughput alone. This work reviews energy-efficient software-hardware co-design methods spanning edge inference and training to datacenter-scale LLM serving, covering accelerator architectures (e.g., ASIC/FPGA dataflows and processing-/compute-in-memory designs) and system-level techniques (e.g., partitioning, quantization, scheduling, and runtime adaptation). We distill common design levers and trade-offs, and highlight recurring gaps, including limited cross-platform generalization, large and costly co-design search spaces, and inconsistent benchmarking across workloads and deployment settings. Finally, we outline a hierarchical-decomposition perspective that maps optimization strategies to computational roles and supports incremental adaptation, offering practical guidance for building energy- and carbon-aware ML systems.
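To make one of the system-level levers named above concrete, here is a minimal sketch of symmetric per-tensor int8 post-training quantization, a standard way to cut memory traffic and data movement at inference time. This is an illustrative example, not code from the paper; all function names are hypothetical.

```python
import numpy as np

def quantize_int8(w: np.ndarray):
    """Symmetric per-tensor int8 quantization: w ≈ scale * q.

    The scale maps the largest-magnitude weight onto ±127, so every
    stored value fits in one byte instead of four (float32).
    """
    max_abs = float(np.max(np.abs(w)))
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover an approximation of the original float weights."""
    return q.astype(np.float32) * scale

# Toy weight tensor (hypothetical values, for illustration only)
w = np.array([0.5, -1.27, 0.031, 0.0], dtype=np.float32)
q, s = quantize_int8(w)
w_hat = dequantize(q, s)

# Per-element reconstruction error is bounded by half a quantization step
assert np.max(np.abs(w - w_hat)) <= s / 2 + 1e-7
```

The 4x reduction in bytes per weight directly reduces the memory-system traffic that the abstract identifies as the dominant cost; the trade-off is the bounded rounding error checked in the final assertion.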
Problem

Research questions and friction points this paper is trying to address.

energy efficiency
machine learning
hardware-software co-design
data movement
memory system

Innovation

Methods, ideas, or system contributions that make the work stand out.

software-hardware co-design
energy efficiency
compute-in-memory
hierarchical decomposition
cross-platform optimization