ARCANE: Adaptive RISC-V Cache Architecture for Near-memory Extensions

📅 2025-04-03
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the high data-movement overhead and low energy efficiency inherent in von Neumann architectures, as well as the limited data-placement flexibility of existing near-memory computing (NMC) approaches, this paper proposes an adaptive RISC-V cache architecture. The design extends the cache controller into a programmable near-memory coprocessor: the controller executes custom instructions issued by the host CPU and dispatches them as vector operations to vector processing units inside the cache memory subsystem. Because the coprocessor is tightly coupled to the cache, memory synchronization and data mapping are handled without involving application software, which removes the restrictive data-layout requirements typical of conventional NMC. The architecture combines hardware acceleration with software-based ISA extensibility. On a worst-case 32-bit CNN workload, it delivers a 30× to 84× performance improvement when operating on 8-bit data, relative to the same system with a traditional cache, at a 41.3% area overhead.

📝 Abstract
Modern data-driven applications expose limitations of von Neumann architectures: extensive data movement, low throughput, and poor energy efficiency. Accelerators improve performance but lack flexibility and require data transfers. Existing compute in- and near-memory solutions mitigate these issues but face usability challenges due to data placement constraints. We propose a novel cache architecture that doubles as a tightly-coupled compute-near-memory coprocessor. Our RISC-V cache controller executes custom instructions from the host CPU using vector operations dispatched to near-memory vector processing units within the cache memory subsystem. This architecture abstracts memory synchronization and data mapping from application software while offering software-based ISA extensibility. Our implementation shows $30\times$ to $84\times$ performance improvement when operating on 8-bit data over the same system with a traditional cache when executing a worst-case 32-bit CNN workload, with only $41.3\%$ area overhead.
Problem

Research questions and friction points this paper is trying to address.

Extensive data movement in von Neumann architectures
Data-placement constraints that limit the usability of existing near-memory computation
Low throughput and poor energy efficiency
Innovation

Methods, ideas, or system contributions that make the work stand out.

Adaptive RISC-V cache for near-memory compute
Vector operations in cache memory subsystem
Software-based ISA extensibility, with memory synchronization and data mapping abstracted from application software
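
To make the idea of computing inside the cache more concrete, here is a minimal behavioural sketch in Python. It is not the paper's design: the class `ToyNearMemoryCache`, the operation `offload_vadd`, the counter `cpu_transfers`, and the 4-element line size are all hypothetical names and parameters chosen for illustration. The sketch only contrasts a host-side loop, where every operand crosses the CPU-cache interface, with an operation that the cache controller applies to resident lines.

```python
def wrap_int8(x):
    """Wrap an integer into the signed 8-bit range [-128, 127]."""
    return ((x + 128) % 256) - 128


class ToyNearMemoryCache:
    """Hypothetical toy model, not the paper's microarchitecture.

    Assumes len(backing) is a multiple of line_elems and ignores
    eviction and write-back to keep the sketch short.
    """

    def __init__(self, backing, line_elems=4):
        self.backing = backing      # stand-in for main memory (list of ints)
        self.line_elems = line_elems
        self.lines = {}             # line index -> list of cached elements
        self.cpu_transfers = 0      # data elements moved between CPU and cache

    def _locate(self, addr):
        """Return (line index, offset), filling the line on a miss."""
        idx = addr // self.line_elems
        if idx not in self.lines:
            start = idx * self.line_elems
            self.lines[idx] = self.backing[start:start + self.line_elems]
        return idx, addr % self.line_elems

    def load(self, addr):
        """CPU load: one element crosses the CPU-cache interface."""
        idx, off = self._locate(addr)
        self.cpu_transfers += 1
        return self.lines[idx][off]

    def store(self, addr, value):
        """CPU store: one element crosses the CPU-cache interface."""
        idx, off = self._locate(addr)
        self.lines[idx][off] = value
        self.cpu_transfers += 1

    def peek(self, addr):
        """Inspect a cached element without counting a transfer (debug only)."""
        idx, off = self._locate(addr)
        return self.lines[idx][off]

    def offload_vadd(self, dst, a, b, n):
        """Hypothetical offloaded op: dst[i] = a[i] + b[i] on int8 data.

        The controller works on resident lines, so no operand data
        is moved to the CPU.
        """
        for i in range(n):
            ia, oa = self._locate(a + i)
            ib, ob = self._locate(b + i)
            idd, od = self._locate(dst + i)
            self.lines[idd][od] = wrap_int8(self.lines[ia][oa] + self.lines[ib][ob])


def cpu_vadd(cache, dst, a, b, n):
    """Baseline: the host CPU loads both operands and stores each result."""
    for i in range(n):
        cache.store(dst + i, wrap_int8(cache.load(a + i) + cache.load(b + i)))


memory = [1, 2, 3, 4, 10, 20, 30, 40, 0, 0, 0, 0]

baseline = ToyNearMemoryCache(list(memory))
cpu_vadd(baseline, dst=8, a=0, b=4, n=4)

near = ToyNearMemoryCache(list(memory))
near.offload_vadd(dst=8, a=0, b=4, n=4)

print(baseline.cpu_transfers, near.cpu_transfers)
```

In this toy setup both paths leave the same four sums in the destination line, but the baseline moves 12 elements across the CPU-cache interface (two loads and one store per element) while the offloaded path moves none. The counts say nothing about the speedups reported in the abstract; they only illustrate why keeping operands in the cache reduces data movement.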