Extending Straight-Through Estimation for Robust Neural Networks on Analog CIM Hardware

📅 2025-08-16
📈 Citations: 0
Influential: 0
🤖 AI Summary
Analog compute-in-memory (CIM) hardware offers substantial energy efficiency gains for neural network inference, yet its deployment robustness is severely hindered by complex, non-ideal hardware noise. Existing noise-aware training relies on differentiable, oversimplified noise models that fail to capture realistic hardware distortions. To address this, we propose a decoupled training framework that separately models forward-pass hardware noise and backward-pass gradient computation. Crucially, we extend the straight-through estimator (STE) to support high-fidelity, non-differentiable noise modeling, with theoretical analysis guaranteeing gradient direction consistency. Our method significantly improves model resilience to hardware non-idealities: it achieves up to 5.3% higher accuracy on image classification, reduces perplexity by 0.72 on text generation, accelerates training by 2.2×, and cuts peak memory usage by 37.9%.

📝 Abstract
Analog Compute-In-Memory (CIM) architectures promise significant energy efficiency gains for neural network inference, but suffer from complex hardware-induced noise that poses major challenges for deployment. While noise-aware training methods have been proposed to address this issue, they typically rely on idealized and differentiable noise models that fail to capture the full complexity of analog CIM hardware variations. Motivated by the Straight-Through Estimator (STE) framework in quantization, we decouple forward noise simulation from backward gradient computation, enabling noise-aware training with more accurate but computationally intractable noise modeling in analog CIM systems. We provide theoretical analysis demonstrating that our approach preserves essential gradient directional information while maintaining computational tractability and optimization stability. Extensive experiments show that our extended STE framework achieves up to 5.3% accuracy improvement on image classification, 0.72 perplexity reduction on text generation, 2.2× speedup in training time, and 37.9% lower peak memory usage compared to standard noise-aware training methods.
Problem

Research questions and friction points this paper is trying to address.

Addressing hardware-induced noise in analog CIM neural networks
Improving noise-aware training with non-differentiable noise models
Enhancing accuracy and efficiency in analog CIM deployments
Innovation

Methods, ideas, or system contributions that make the work stand out.

Decouples forward noise and backward gradients
Enables training with accurate noise models whose gradients are intractable
Improves accuracy and reduces training time
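The decoupling described above can be sketched in plain NumPy: the forward pass runs a non-differentiable noise model on the ideal matrix product, while the backward pass ignores the noise and propagates gradients through the noiseless computation, STE-style. This is a minimal illustration, not the authors' implementation; the quantize-plus-state-dependent-noise model below is a hypothetical stand-in for a real analog CIM hardware simulator.

```python
import numpy as np

rng = np.random.default_rng(0)

def noisy_matmul_forward(x, w):
    """Forward pass: apply a hypothetical non-differentiable CIM noise
    model (coarse quantization plus state-dependent Gaussian noise) to
    the ideal matmul output. Placeholder for a hardware simulator."""
    ideal = x @ w
    noisy = np.round(ideal * 8) / 8                      # non-differentiable step
    noisy = noisy + 0.01 * np.abs(noisy) * rng.standard_normal(noisy.shape)
    return ideal, noisy

def ste_backward(x, w, grad_out):
    """Backward pass: skip the noise model entirely and backpropagate
    through the ideal matmul, as the straight-through estimator does."""
    grad_x = grad_out @ w.T
    grad_w = x.T @ grad_out
    return grad_x, grad_w

x = rng.standard_normal((4, 3))
w = rng.standard_normal((3, 2))
ideal, noisy = noisy_matmul_forward(x, w)
grad_x, grad_w = ste_backward(x, w, np.ones_like(noisy))
```

The key point is that `noisy_matmul_forward` can be arbitrarily complex (lookup tables, Monte Carlo device models) because nothing in the backward pass differentiates through it; gradient quality then rests on the directional-consistency analysis the paper provides.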
Yuannuo Feng
School of Integrated Circuit Science and Engineering, Beihang University, Beijing, China
Wenyong Zhou
The University of Hong Kong
Computer Vision
Yuexi Lyu
Zhicun Research Lab, Beijing, China
Yixiang Zhang
Zhicun Research Lab, Beijing, China
Zhengwu Liu
The University of Hong Kong (HKU) / Tsinghua University (THU)
Brain machine interfaces, computing in memory, memristor
Ngai Wong
Department of Electrical and Electronic Engineering, The University of Hong Kong, Hong Kong
Wang Kang
Beihang University
Spintronics, Nonvolatile Memory and Logic Circuits, Non-Von Neumann Computing Architectures