'1'-bit Count-based Sorting Unit to Reduce Link Power in DNN Accelerators

📅 2026-01-20

🤖 AI Summary
This work addresses the high interconnect power consumption in deep neural network (DNN) accelerators caused by excessive switching activity. To mitigate this issue, the authors propose a comparison-free hardware sorting architecture based on counting the number of '1' bits in data words, which reorders operands to reduce bit toggling across communication links. To balance energy efficiency and hardware overhead, an approximation-aware, coarse-grained bucket grouping strategy is introduced. This approach maintains an average 19.50% reduction in bit switching while significantly reducing the area of the sorting unit. Experimental results demonstrate that, in convolutional neural network (CNN)-oriented scenarios, the proposed method achieves up to 35.4% hardware area savings, effectively reconciling power optimization with resource efficiency.
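The reordering idea summarized above can be illustrated in software. The following is a minimal sketch, not the authors' hardware design: it assumes 8-bit words, and the helper names `popcount`, `bit_transitions`, and `popcount_sort` are hypothetical. Words are dropped into bins indexed by their '1'-bit count, so no word is ever compared against another word.

```python
from typing import List


def popcount(word: int) -> int:
    """Number of '1' bits in a non-negative integer word."""
    return bin(word).count("1")


def bit_transitions(words: List[int]) -> int:
    """Total bit toggles between consecutive words sent over one link."""
    return sum(popcount(a ^ b) for a, b in zip(words, words[1:]))


def popcount_sort(words: List[int], width: int = 8) -> List[int]:
    """Order words by '1'-bit count without comparing words to each other.

    Each word goes into the bin indexed by its popcount, and the bins are
    then read out in ascending order (a counting-sort style scheme).
    The 8-bit default width is an assumption made for this sketch.
    """
    bins: List[List[int]] = [[] for _ in range(width + 1)]
    for w in words:
        bins[popcount(w)].append(w)
    return [w for b in bins for w in b]


if __name__ == "__main__":
    stream = [0xFF, 0x00, 0x0F, 0x01]
    ordered = popcount_sort(stream)
    # Toggles before and after reordering: prints "15 8"
    print(bit_transitions(stream), bit_transitions(ordered))
```

For this small stream the toggle count drops from 15 to 8. Ordering by popcount is a heuristic, so the saving depends on the data and is not guaranteed for every input.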


๐Ÿ“ Abstract
Interconnect power consumption remains a bottleneck in Deep Neural Network (DNN) accelerators. While ordering data based on '1'-bit counts can mitigate this by reducing switching activity, practical hardware sorting implementations remain underexplored. This work proposes a hardware implementation of a comparison-free sorting unit optimized for Convolutional Neural Networks (CNNs). By leveraging approximate computing to group population counts into coarse-grained buckets, our design reduces hardware area while preserving the link power benefits of data reordering. Our approximate sorting unit achieves up to 35.4% area reduction while maintaining a 19.50% bit transition (BT) reduction, compared with 20.42% for the precise implementation.
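The coarse-grained bucketing described in the abstract can be sketched as follows. This is an illustration under stated assumptions, not the paper's design: the 8-bit word width, the choice of four buckets, the even split of the population-count range, and the names `bucket_of` and `approx_popcount_sort` are all hypothetical, since the abstract does not specify the grouping.

```python
from typing import List


def bucket_of(word: int, width: int = 8, num_buckets: int = 4) -> int:
    """Map a word's '1'-bit count (0..width) onto a coarse bucket index.

    The even split of the popcount range is an assumption for this sketch.
    """
    ones = bin(word).count("1")
    return ones * num_buckets // (width + 1)


def approx_popcount_sort(
    words: List[int], width: int = 8, num_buckets: int = 4
) -> List[int]:
    """Approximate ordering: one bin per bucket instead of one per popcount."""
    buckets: List[List[int]] = [[] for _ in range(num_buckets)]
    for w in words:
        buckets[bucket_of(w, width, num_buckets)].append(w)
    # Words inside a bucket keep their arrival order, so the result is
    # only approximately sorted by '1'-bit count.
    return [w for b in buckets for w in b]


if __name__ == "__main__":
    stream = [0xFF, 0x03, 0x01, 0x0F]
    print([hex(w) for w in approx_popcount_sort(stream)])
```

With these assumed parameters the sketch uses four bins instead of the nine needed for an exact 8-bit population-count sort. In the example, `0x03` stays ahead of `0x01` because both fall into the same bucket; that loss of fine-grained order is the kind of approximation that could trade a small amount of BT reduction for a smaller sorting unit.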
Problem

Research questions and friction points this paper is trying to address.

interconnect power
DNN accelerators
1-bit count
sorting unit
switching activity
Innovation

Methods, ideas, or system contributions that make the work stand out.

1-bit count sorting
approximate computing
DNN accelerator
interconnect power reduction
comparison-free sorting