Energy-Efficient Software-Hardware Co-Design for Machine Learning: From TinyML to Large Language Models

📅 2026-03-24
📈 Citations: 0
Influential: 0
📝 Abstract
The rapid deployment of machine learning across platforms, from milliwatt-class TinyML devices to large language models, has made energy efficiency a primary constraint for sustainable AI. Across these scales, performance and energy are increasingly limited by data movement and memory-system behavior rather than by arithmetic throughput alone. This work reviews energy-efficient software-hardware co-design methods spanning edge inference and training to datacenter-scale LLM serving, covering accelerator architectures (e.g., ASIC/FPGA dataflows and processing-/compute-in-memory designs) and system-level techniques (e.g., partitioning, quantization, scheduling, and runtime adaptation). We distill common design levers and trade-offs, and highlight recurring gaps, including limited cross-platform generalization, large and costly co-design search spaces, and inconsistent benchmarking across workloads and deployment settings. Finally, we outline a hierarchical-decomposition perspective that maps optimization strategies to computational roles and supports incremental adaptation, offering practical guidance for building energy- and carbon-aware ML systems.
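To make one of the system-level levers named above concrete, here is a minimal sketch of symmetric per-tensor int8 post-training quantization, a standard way to cut memory traffic and data movement at inference time. This is an illustrative example, not code from the paper; all function names are hypothetical.

```python
import numpy as np

def quantize_int8(w: np.ndarray):
    """Symmetric per-tensor int8 quantization: w ≈ scale * q.

    The scale maps the largest-magnitude weight onto ±127, so every
    stored value fits in one byte instead of four (float32).
    """
    max_abs = float(np.max(np.abs(w)))
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover an approximation of the original float weights."""
    return q.astype(np.float32) * scale

# Toy weight tensor (hypothetical values, for illustration only)
w = np.array([0.5, -1.27, 0.031, 0.0], dtype=np.float32)
q, s = quantize_int8(w)
w_hat = dequantize(q, s)

# Per-element reconstruction error is bounded by half a quantization step
assert np.max(np.abs(w - w_hat)) <= s / 2 + 1e-7
```

The 4x reduction in bytes per weight directly reduces the memory-system traffic that the abstract identifies as the dominant cost; the trade-off is the bounded rounding error checked in the final assertion.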
Problem

Research questions and friction points this paper is trying to address.

energy efficiency
machine learning
hardware-software co-design
data movement
memory system

Innovation

Methods, ideas, or system contributions that make the work stand out.

software-hardware co-design
energy efficiency
compute-in-memory
hierarchical decomposition
cross-platform optimization