A Bit Level Weight Reordering Strategy Based on Column Similarity to Explore Weight Sparsity in RRAM-based NN Accelerator

📅 2025-11-18

🤖 AI Summary

Problem: Exploiting weight sparsity in RRAM-based compute-in-memory (CIM) systems is challenging because unstructured sparsity disrupts the regular computational patterns of crossbar arrays. Method: This paper proposes a column-similarity-aware, bit-level weight reordering technique. With weights stored in two's-complement form, it treats bit-level sparsity as a special case of inter-column bit similarity: when two crossbar columns hold identical bit values, only one of them is kept, and the compressed weight matrices are then mapped compactly onto Operation Units (OUs). Results: Evaluated on typical neural networks, the approach achieves a 61.24% average performance improvement and 1.51×–2.52× energy savings under different sparsity ratios, with only slight overhead compared to the state-of-the-art design.

📝 Abstract

Compute-in-Memory (CIM) and weight sparsity are two effective techniques for reducing data movement during Neural Network (NN) inference. However, they can hardly be employed in the same accelerator simultaneously, because CIM requires structured compute patterns, which sparse NNs disrupt. In this paper, we partially solve this issue by proposing a bit-level weight reordering strategy that realizes compact mapping of sparse NN weight matrices onto Resistive Random Access Memory (RRAM) based NN Accelerators (RRAM-Acc). Specifically, when weights are mapped to RRAM crossbars in two's-complement form, we observe, and can also prove mathematically, that bit-level sparsity and similarity commonly exist in the crossbars. The bit reordering method treats bit sparsity as a special case of bit similarity, retains only one column from each pair of columns that have identical bit values, and then maps the compressed weight matrices into Operation Units (OUs). The performance of our design is evaluated with typical NNs. Simulation results show a 61.24% average performance improvement and 1.51×–2.52× energy savings under different sparsity ratios, with only slight overhead compared to the state-of-the-art design.
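
The column-merging idea in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function names (`to_bit_columns`, `merge_identical_columns`, `matvec_compressed`), the 4-bit width, the MSB-first column order, and the choice to drop all-zero columns outright are assumptions made for illustration. The sketch expands integer weights into two's-complement bit columns, keeps one copy of each distinct column, and then checks that the original matrix-vector product can still be recovered from the compressed columns.

```python
from typing import List, Tuple

BitColumn = Tuple[int, ...]


def to_bit_columns(weights: List[List[int]], n_bits: int) -> List[BitColumn]:
    """Expand a rows x cols integer matrix into two's-complement bit columns.

    Each weight column yields n_bits bit columns, ordered MSB first
    (an assumed layout, not necessarily the paper's).
    """
    mask = (1 << n_bits) - 1
    rows, cols = len(weights), len(weights[0])
    bit_cols = []
    for c in range(cols):
        for b in range(n_bits - 1, -1, -1):
            # (w & mask) gives the n_bits two's-complement pattern of w.
            bit_cols.append(
                tuple(((weights[r][c] & mask) >> b) & 1 for r in range(rows))
            )
    return bit_cols


def merge_identical_columns(bit_cols: List[BitColumn]):
    """Keep one copy of each distinct non-zero bit column.

    Returns (kept, mapping): mapping[i] is the index in `kept` that
    original column i uses, or -1 for an all-zero column.
    """
    zero = tuple(0 for _ in bit_cols[0])
    kept, index_of, mapping = [], {}, []
    for col in bit_cols:
        if col == zero:
            mapping.append(-1)  # sparsity: nothing to store or compute
            continue
        if col not in index_of:
            index_of[col] = len(kept)
            kept.append(col)
        mapping.append(index_of[col])  # similarity: reuse the stored column
    return kept, mapping


def matvec_compressed(x, kept, mapping, n_cols: int, n_bits: int) -> List[int]:
    """Recover x^T W using one partial sum per kept column."""
    partial = [sum(xi * bi for xi, bi in zip(x, col)) for col in kept]
    out = []
    for c in range(n_cols):
        acc = 0
        for k in range(n_bits):
            b = n_bits - 1 - k  # bit significance of this column
            idx = mapping[c * n_bits + k]
            p = 0 if idx < 0 else partial[idx]
            # The MSB carries negative weight in two's complement.
            acc += (-(1 << b) if b == n_bits - 1 else (1 << b)) * p
        out.append(acc)
    return out


if __name__ == "__main__":
    W = [[1, -1], [0, -1], [1, -2]]  # 3 inputs, 2 output columns
    cols = to_bit_columns(W, n_bits=4)
    kept, mapping = merge_identical_columns(cols)
    print(len(cols), "bit columns ->", len(kept), "kept")
    print(matvec_compressed([2, 3, 5], kept, mapping, n_cols=2, n_bits=4))
```

In this toy case, small positive weights produce all-zero high-order bit columns, and small negative weights produce identical all-one high-order columns because of sign extension, so only 3 of the 8 bit columns need to be stored. The `mapping` list stands in for whatever index bookkeeping a real design would need, which is one plausible source of the overhead the abstract mentions.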

Problem

Research questions and friction points this paper is trying to address.

Resolving the conflict between Compute-in-Memory and weight sparsity in neural accelerators

Enabling compact mapping of sparse weights onto RRAM-based NN accelerators

Leveraging bit-level sparsity and similarity to improve performance and energy efficiency

Innovation

Methods, ideas, or system contributions that make the work stand out.

Bit-level weight reordering strategy for RRAM accelerators

Exploits bit sparsity and similarity in crossbars

Compresses weight matrices by removing identical columns
