A Bit Level Weight Reordering Strategy Based on Column Similarity to Explore Weight Sparsity in RRAM-based NN Accelerator

📅 2025-11-18
📈 Citations: 0
Influential: 0
🤖 AI Summary
Exploiting weight sparsity in RRAM-based compute-in-memory (CIM) systems is challenging because unstructured sparsity disrupts the regular computational patterns of crossbar arrays. Method: This paper proposes a column-similarity-aware bit-level weight reordering technique. It models bit-level sparsity as inter-column bit-value consistency, enabling compact, structured mapping of sparse weights onto RRAM arrays by detecting and merging columns with identical bit values across weight bits. Combined with two's-complement encoding and column-similarity detection, the method preserves hardware efficiency while achieving high sparsity compression. Results: Evaluated on representative neural networks, the approach achieves a 61.24% average performance improvement and 1.51×–2.52× energy reduction, with modest overhead. The authors claim it is the first work to unify sparsity exploitation with crossbar computational efficiency in RRAM CIM architectures.

📝 Abstract
Compute-in-Memory (CIM) and weight sparsity are two effective techniques for reducing data movement during Neural Network (NN) inference. However, they can hardly be employed in the same accelerator simultaneously, because CIM requires structured compute patterns that sparse NNs disrupt. In this paper, we partially solve this issue by proposing a bit-level weight reordering strategy that realizes compact mapping of sparse NN weight matrices onto Resistive Random Access Memory (RRAM) based NN Accelerators (RRAM-Acc). Specifically, when weights are mapped to RRAM crossbars in two's-complement form, we observe, and mathematically prove, that bit-level sparsity and similarity commonly exist in the crossbars. The bit reordering method treats bit sparsity as a special case of bit similarity, reserves only one column out of each pair of columns with identical bit values, and then maps the compressed weight matrices into Operation Units (OUs). The performance of our design is evaluated on typical NNs. Simulation results show a 61.24% average performance improvement and 1.51×–2.52× energy savings under different sparsity ratios, with only slight overhead compared to the state-of-the-art design.
Problem

Research questions and friction points this paper is trying to address.

Resolving conflict between Compute-in-Memory and weight sparsity in neural accelerators
Enabling compact mapping of sparse weights onto RRAM-based NN accelerators
Leveraging bit-level sparsity and similarity to improve performance and energy efficiency
Innovation

Methods, ideas, or system contributions that make the work stand out.

Bit-level weight reordering strategy for RRAM accelerators
Exploits bit sparsity and similarity in crossbars
Compresses weight matrices by removing identical columns
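The column-merging idea above can be sketched in a few lines: decompose the weight matrix into two's-complement bit planes, then, per plane, keep one representative out of each group of columns with identical bit values (an all-zero column is just the sparsity special case, since all zero columns merge into one). This is a minimal illustrative sketch, not the paper's implementation; the function names, the int8/8-bit setting, and the per-plane merge granularity are assumptions.

```python
import numpy as np

def bit_planes(weights, bits=8):
    """Decompose signed integer weights into two's-complement bit planes.
    Plane b holds bit b of every weight (LSB first)."""
    u = weights.astype(np.int64) & ((1 << bits) - 1)  # two's-complement view
    return [((u >> b) & 1).astype(np.uint8) for b in range(bits)]

def merge_identical_columns(plane):
    """Keep one representative per group of identical bit columns.
    Returns (compressed plane, mapping: original column -> kept column)."""
    seen = {}          # column bytes -> index in the compressed plane
    keep, mapping = [], []
    for j in range(plane.shape[1]):
        key = plane[:, j].tobytes()
        if key not in seen:
            seen[key] = len(keep)
            keep.append(j)
        mapping.append(seen[key])
    return plane[:, keep], mapping

# Toy example: column 1 is all-zero (pure bit sparsity),
# column 2 duplicates column 0 (bit similarity).
W = np.array([[ 3, 0,  3],
              [-2, 0, -2],
              [ 5, 0,  5]], dtype=np.int8)

for b, plane in enumerate(bit_planes(W)):
    compressed, mapping = merge_identical_columns(plane)
    # Each compressed plane is what gets mapped onto crossbar OUs;
    # `mapping` tells the peripheral logic which stored column to reuse.
```

In hardware, the `mapping` table corresponds to the index/decoder overhead the paper reports as "slight": identical columns are stored once, and the accumulation logic routes their partial sums back to the original output positions.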
Weiping Yang
College of Electronic Science and Technology, National University of Defense Technology, Changsha, China
Shilin Zhou
School of Computer Science and Technology, Soochow University
Hui Xu
College of Electronic Science and Technology, National University of Defense Technology, Changsha, China
Yujiao Nie
College of Electronic Science and Technology, National University of Defense Technology, Changsha, China
Qimin Zhou
College of Electronic Science and Technology, National University of Defense Technology, Changsha, China
Zhiwei Li
College of Electronic Science and Technology, National University of Defense Technology, Changsha, China
Changlin Chen
College of Electronic Science and Technology, National University of Defense Technology, Changsha, China