🤖 AI Summary
Recommendation system DNN inference faces severe energy-efficiency bottlenecks. Processing-in-Memory (PIM) architectures offer a promising remedy, but their design space is prohibitively large (over 10⁵⁴ candidate architectures), and operator-to-hardware mapping remains highly complex. This paper proposes an end-to-end automated PIM accelerator design methodology, the first to jointly search recommendation models and mixed-precision PIM architectures. It formulates the co-design as a combinatorial search over mixed-precision interaction operations and parameterizes that search with a one-shot supernet, unifying PIM-aware operator mapping, mixed-precision interaction modeling, and automated hardware generation. Evaluated on click-through rate (CTR) prediction, the approach achieves up to 3.36× inference speedup, 1.68× area reduction, and 12.48× higher energy efficiency than naively mapped searched designs and state-of-the-art handcrafted designs. This work significantly advances the hardware deployment of high-efficiency recommender systems.
📝 Abstract
The performance bottleneck of deep-learning-based recommender systems resides in their backbone Deep Neural Networks. By integrating Processing-In-Memory~(PIM) architectures, researchers can reduce data movement and enhance energy efficiency, paving the way for next-generation recommender models. Nevertheless, achieving performance and efficiency gains is challenging due to the complexity of the PIM design space and the intricate mapping of operators. In this paper, we demonstrate that automated PIM design is feasible even within the most demanding recommender model design space, spanning over $10^{54}$ possible architectures. We propose methodname, which formulates the co-optimization of recommender models and PIM design as a combinatorial search over mixed-precision interaction operations, and parameterizes the search with a one-shot supernet encompassing all mixed-precision options. We comprehensively evaluate our approach on three Click-Through Rate benchmarks, showcasing the superiority of our automated design methodology over manual approaches. Our results indicate up to a 3.36$\times$ speedup, 1.68$\times$ area reduction, and 12.48$\times$ higher power efficiency compared to naively mapped searched designs and state-of-the-art handcrafted designs.
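To make the "one-shot supernet over mixed-precision options" idea concrete, here is a minimal, hypothetical sketch (not the paper's actual implementation): each layer keeps one set of shared full-precision weights, and every forward pass samples a candidate bit-width per layer, so a single trained supernet covers the whole combinatorial space of precision assignments. The candidate bit-widths and fake-quantization scheme below are illustrative assumptions.

```python
import random

# Assumed candidate precisions per layer; the real search space would
# also include PIM hardware parameters, which are omitted here.
CANDIDATE_BITS = [2, 4, 8]

def fake_quantize(w, bits):
    """Uniformly quantize a scalar weight in [-1, 1] to 2^bits - 1 levels."""
    levels = 2 ** bits - 1
    return round((w + 1) / 2 * levels) / levels * 2 - 1

class SupernetLayer:
    """One layer of the supernet: shared weight, precision chosen at call time."""
    def __init__(self, weight):
        self.weight = weight  # full-precision shared parameter

    def forward(self, x, bits):
        return fake_quantize(self.weight, bits) * x

def sample_subnet(layers):
    """Sample one mixed-precision configuration (a 'subnet') uniformly."""
    return [random.choice(CANDIDATE_BITS) for _ in layers]

layers = [SupernetLayer(0.37), SupernetLayer(-0.81)]
config = sample_subnet(layers)   # e.g. [4, 8]
x = 1.0
for layer, bits in zip(layers, config):
    x = layer.forward(x, bits)
# Search space size grows as |CANDIDATE_BITS| ** num_layers,
# which is why the full co-design space exceeds 10^54.
```

During search, many such subnets would be sampled and scored (accuracy plus PIM latency/area/energy estimates) without retraining, since all of them share the same weights.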