AutoRAC: Automated Processing-in-Memory Accelerator Design for Recommender Systems

📅 2025-05-15
🤖 AI Summary
DNN inference for recommender systems faces severe energy-efficiency bottlenecks. Processing-in-Memory (PIM) architectures are a promising remedy, but their design space is prohibitively large (more than 10⁵⁴ candidate architectures) and operator-hardware co-mapping remains highly complex. This paper proposes AutoRAC, an end-to-end automated PIM accelerator design methodology that is the first to jointly search recommender models and mixed-precision PIM architectures. It introduces a one-shot supernet parameterization that unifies combinatorial optimization, PIM-aware operator mapping, mixed-precision interaction modeling, and RTL-level automatic synthesis. Evaluated on click-through rate (CTR) prediction, the approach achieves up to 3.36× inference speedup, 1.68× area reduction, and 12.48× higher power efficiency compared with naively mapped searched designs and state-of-the-art handcrafted designs. These results support the hardware deployment of efficient recommender systems on PIM.

📝 Abstract
The performance bottleneck of deep-learning-based recommender systems resides in their backbone Deep Neural Networks. By integrating Processing-In-Memory (PIM) architectures, researchers can reduce data movement and enhance energy efficiency, paving the way for next-generation recommender models. Nevertheless, achieving performance and efficiency gains is challenging due to the complexity of the PIM design space and the intricate mapping of operators. In this paper, we demonstrate that automated PIM design is feasible even within the most demanding recommender model design space, spanning over $10^{54}$ possible architectures. We propose AutoRAC, which formulates the co-optimization of recommender models and PIM design as a combinatorial search over mixed-precision interaction operations, and parameterizes the search with a one-shot supernet encompassing all mixed-precision options. We comprehensively evaluate our approach on three Click-Through Rate benchmarks, showcasing the superiority of our automated design methodology over manual approaches. Our results indicate up to a 3.36$\times$ speedup, 1.68$\times$ area reduction, and 12.48$\times$ higher power efficiency compared to naively mapped searched designs and state-of-the-art handcrafted designs.
Problem

Research questions and friction points this paper is trying to address.

Addressing the performance bottleneck in deep-learning-based recommender systems
Overcoming complexity in Processing-in-Memory design space exploration
Automating co-optimization of recommender models and PIM architectures
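
To give a feel for why a design space of more than 10^54 architectures rules out exhaustive exploration, the following minimal sketch counts how many independent design decisions are needed before a combinatorial space exceeds that size. The function name and the per-decision option counts (2, 8, 16) are illustrative assumptions, not figures from the paper.

```python
def decisions_needed(choices_per_decision: int, target: int = 10**54) -> int:
    """Smallest n such that choices_per_decision ** n exceeds target.

    Uses exact integer arithmetic, so there are no floating-point edge cases.
    """
    if choices_per_decision < 2:
        raise ValueError("need at least two options per decision")
    n, size = 0, 1
    while size <= target:
        size *= choices_per_decision
        n += 1
    return n


# Hypothetical option counts per decision; the real search space is defined in the paper.
for k in (2, 8, 16):
    print(f"{k} options per decision -> {decisions_needed(k)} decisions to exceed 10^54")
```

Even with only a few options per decision, a few dozen joint model and hardware choices are enough to reach this scale, which is why a search procedure is used instead of enumeration.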
Innovation

Methods, ideas, or system contributions that make the work stand out.

Automated PIM design for recommender systems
Combinatorial search over mixed-precision operations
One-shot supernet parameterizes all design options
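
The idea of a one-shot supernet over mixed-precision options can be sketched as follows. This is a toy illustration under stated assumptions, not the AutoRAC implementation: the candidate bit-widths, the symmetric uniform quantizer, the dense layers, and every class and function name here are hypothetical. What it shows is the weight-sharing principle: each layer keeps a single set of weights, and a sampled configuration picks one precision per layer.

```python
import random

# Hypothetical candidate precisions; the paper's actual options may differ.
BIT_CHOICES = (2, 4, 8)


def fake_quantize(values, bits):
    """Symmetric uniform quantization of a list of floats to `bits` bits."""
    levels = 2 ** (bits - 1) - 1              # e.g. 127 for 8 bits, 1 for 2 bits
    max_abs = max(abs(v) for v in values) or 1.0
    scale = max_abs / levels
    return [round(v / scale) * scale for v in values]


class SharedLayer:
    """One supernet layer: a single weight matrix shared by all precision options."""

    def __init__(self, n_in, n_out, rng):
        self.weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)]
                        for _ in range(n_out)]

    def forward(self, x, bits):
        out = []
        for row in self.weights:
            q_row = fake_quantize(row, bits)  # quantize shared weights on the fly
            out.append(sum(w * xi for w, xi in zip(q_row, x)))
        return out


class Supernet:
    """Stack of shared layers; a config assigns one bit-width to each layer."""

    def __init__(self, dims, seed=0):
        self.rng = random.Random(seed)
        self.layers = [SharedLayer(a, b, self.rng) for a, b in zip(dims, dims[1:])]

    def sample_config(self):
        return [self.rng.choice(BIT_CHOICES) for _ in self.layers]

    def num_configs(self):
        return len(BIT_CHOICES) ** len(self.layers)

    def forward(self, x, config):
        for layer, bits in zip(self.layers, config):
            x = [max(0.0, v) for v in layer.forward(x, bits)]  # ReLU activation
        return x


net = Supernet([4, 8, 2])
config = net.sample_config()
print("sampled bit-widths:", config)
print("output:", net.forward([1.0, 0.5, -0.5, 0.2], config))
```

In a real one-shot search, the shared weights would be trained while configurations are sampled, and candidate configurations would then be ranked using the trained supernet together with hardware cost estimates; the sketch omits both steps.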

Feng Cheng
Duke University
Tunhou Zhang
Duke University
Junyao Zhang
Duke University
J. Ku
Duke University
Yitu Wang
Duke University
Xiaoxuan Yang
University of Virginia
In-Memory Computing, Computer-Aided Design, Machine Learning Acceleration
Hai Li
Duke University
Yiran Chen
Duke University