AI Summary
Conventional adaptive voltage scaling (AVS) techniques overlook the inherent fault tolerance of deep neural networks (DNNs), so they apply excessive voltage margins that accelerate circuit aging and increase power consumption. To overcome this limitation, the authors propose an aging-aware voltage scaling methodology that explicitly incorporates DNN fault tolerance into a full-lifetime aging prediction framework. The framework combines history-effect modeling with iterative extrapolation, and the scaling policy built on it defers voltage increases that DNN resilience makes unnecessary. Relative to maximum-voltage baselines, the prediction framework lowers the predicted threshold voltage shift by 19.4% for PMOS and 19.1% for NMOS transistors. On representative DNN workloads, the policy reduces aging-induced degradation by up to 45.8% (NMOS) and 30.6% (PMOS) and cuts average lifetime power consumption by 14.0% compared with resilience-agnostic methods.
Abstract
Deep neural networks (DNNs) have showcased remarkable performance across various tasks and are widely deployed on AI accelerators fabricated in advanced technology nodes for efficiency. As aging effects become more pronounced, timing and voltage guardbands are increasingly applied. Aging-aware adaptive voltage scaling (AVS), which adjusts supply voltage based on on-chip aging scenarios, has emerged as a promising solution to avoid excessive guardbanding. However, conventional AVS techniques overlook the inherent resilience of DNNs and frequently raise the supply voltage unnecessarily, thereby exacerbating aging and increasing power consumption. To enable reliable and efficient AI inference with AVS, in this paper, we develop an accurate aging prediction framework that incorporates historical effects and iterative extrapolation for full-lifetime modeling. Building on this framework, we propose a fault-tolerant voltage scaling policy that exploits DNN resilience and defers voltage increases accordingly. Experiments show that our framework mitigates the pessimism of maximum-voltage baselines, reducing predicted threshold voltage shift (ΔVth) by 19.4% for PMOS and 19.1% for NMOS. Furthermore, evaluation on representative DNN workloads demonstrates that our optimization reduces aging degradation by up to 45.8% (NMOS) and 30.6% (PMOS) while achieving 14.0% average lifetime power savings compared to resilience-agnostic methods.
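The interplay between history-aware aging prediction and deferred voltage increases can be illustrated with a small toy simulation. This is only a sketch under stated assumptions, not the paper's framework: the power-law aging model, the equivalent-stress-time treatment of history, the alpha-power delay model, the V^2 power proxy, the `tolerance` parameter standing in for DNN resilience, and every constant are hypothetical choices made for illustration.

```python
import math

# All constants are illustrative placeholders, not values from the paper.
A, B, N = 0.005, 2.0, 0.2          # assumed aging law: dVth = A * exp(B*V) * t**N
VTH0, ALPHA = 0.30, 1.3            # fresh threshold voltage (V), alpha-power exponent
V_MIN, V_MAX, V_STEP = 0.70, 1.00, 0.01


def delay(v, dvth):
    """Alpha-power-law gate delay in arbitrary units (assumed model)."""
    return v / (v - VTH0 - dvth) ** ALPHA


def age_step(dvth, v, dt):
    """Advance dVth by dt at supply v while carrying over prior stress.

    The accumulated shift is converted into an equivalent stress time at the
    current voltage, then the power law is extrapolated forward by dt.
    """
    k = A * math.exp(B * v)
    t_eq = (dvth / k) ** (1.0 / N)
    return k * (t_eq + dt) ** N


def simulate(tolerance, steps=120, dt=1.0):
    """Step through the lifetime, raising v only when delay exceeds the budget.

    tolerance = 0.0 mimics a resilience-agnostic policy; tolerance > 0 lets
    delay exceed the fresh-chip budget by that fraction before v is raised.
    Returns (final dVth, final supply voltage, average V^2 power proxy).
    """
    budget = delay(V_MIN, 0.0) * (1.0 + tolerance)
    v, dvth, energy = V_MIN, 0.0, 0.0
    for _ in range(steps):
        dvth = age_step(dvth, v, dt)
        while delay(v, dvth) > budget and v < V_MAX:
            v = min(V_MAX, v + V_STEP)
        energy += v * v * dt       # dynamic power taken as proportional to V^2
    return dvth, v, energy / (steps * dt)


if __name__ == "__main__":
    for tol in (0.0, 0.05):
        dvth, v, power = simulate(tol)
        print(f"tolerance={tol:.2f}  dVth={dvth:.4f} V  V_final={v:.2f} V  avg V^2={power:.4f}")
```

In this toy setting, a nonzero tolerance keeps the supply voltage lower for longer, which in turn slows the voltage-dependent aging term. The numbers it prints are artifacts of the placeholder constants and should not be compared with the percentages reported above.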