🤖 AI Summary
This work addresses three key bottlenecks in SRAM-based in-memory computing: the large area overhead of analog-to-digital converters (ADCs), high latency for multi-bit inputs, and limited read bitline voltage. To overcome these challenges, the authors propose a highly reconfigurable 256×128 in-memory compute array supporting 1–7-bit inputs, 2–4-bit weights, and 1–7-bit outputs. The design integrates an in-memory ADC (IMADC) with only 3% area overhead, a charge-sharing-based bit-sliced column hybrid adder (BSCHA) acceleration mechanism, dual-8T ternary-weight storage cells with a decoupled read path, and a read-wordline under-driven cascode technique to jointly optimize computational efficiency and precision. The IMADC improves area overhead by 9× over previous IMADC designs, BSCHA reduces latency by 1.9× and 6.6× relative to PWM and bit-slicing modes respectively, unit discharge current linearity improves by 7×, and the usable read bitline voltage increases by 3.5×.
📝 Abstract
SRAM-based analog computing-in-memory demonstrates outstanding efficiency. However, it faces three critical challenges: significant analog-to-digital converter (ADC) overhead, high latency for multi-bit inputs, and limited read bitline voltage. To address these issues, this work proposes a highly reconfigurable multi-bit 256×128 in-memory computing array supporting 1-7b inputs, 2-4b weights, and 1-7b outputs. Three key innovations are introduced: 1) An in-memory ADC (IMADC) incurs only 3% area overhead, a 9x improvement over previous IMADC designs; 2) The BSCHA reduces latency by 1.9x and 6.6x compared to traditional pulse-width modulation (PWM) and bit-slicing modes, respectively; 3) A dual-8T bitcell that stores ternary weights through a decoupled read path, combined with a read-wordline under-driven cascode technique, improves the linearity of the unit discharge current by 7x and increases the usable read bitline voltage by 3.5x.