DOSA: Differentiable Model-Based One-Loop Search for DNN Accelerators

📅 2025-09-12
🤖 AI Summary
Joint optimization of the hardware design space and the algorithmic mapping space suffers from combinatorial explosion. This paper proposes DOSA, the first framework to formulate co-search of hardware and mappings as a differentiable optimization problem. DOSA constructs a differentiable performance model by integrating analytical modeling with learned components, enabling gradient-based continuous optimization. The framework is modular and supports end-to-end joint optimization of buffer configurations, dataflow mappings, and hardware parameters. Experiments demonstrate that, under similar sampling budgets, DOSA improves energy-delay product by 2.80× and 12.59× over random search and Bayesian optimization, respectively; on a real DNN accelerator, it achieves a 1.82× improvement in energy-delay product. The core contribution is a co-differentiable hardware-mapping modeling paradigm that overcomes the limitations of conventional staged optimization approaches.

📝 Abstract
In the hardware design space exploration process, it is critical to optimize both hardware parameters and algorithm-to-hardware mappings. Previous work has largely approached this simultaneous optimization problem by separately exploring the hardware design space and the mapspace - both individually large and highly nonconvex spaces - independently. The resulting combinatorial explosion has created significant difficulties for optimizers. In this paper, we introduce DOSA, which consists of differentiable performance models and a gradient descent-based optimization technique to simultaneously explore both spaces and identify high-performing design points. Experimental results demonstrate that DOSA outperforms random search and Bayesian optimization by 2.80x and 12.59x, respectively, in improving DNN model energy-delay product, given a similar number of samples. We also demonstrate the modularity and flexibility of DOSA by augmenting our analytical model with a learned model, allowing us to optimize buffer sizes and mappings of a real DNN accelerator and attain a 1.82x improvement in energy-delay product.
Problem

Research questions and friction points this paper is trying to address.

Simultaneously optimizing hardware parameters and algorithm-to-hardware mappings
Addressing combinatorial explosion in hardware design space exploration
Improving energy-delay product for DNN accelerator designs
Innovation

Methods, ideas, or system contributions that make the work stand out.

Differentiable performance models for hardware and mappings
Gradient descent-based simultaneous space optimization
Modular analytical and learned model integration
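
The co-search idea above can be illustrated with a toy sketch. This is a hypothetical stand-in, not DOSA's actual performance model: a made-up smooth energy-delay-product surrogate over continuous relaxations of one mapping parameter (a tiling factor) and one hardware parameter (a buffer size), minimized jointly by gradient descent in log-space, with finite differences standing in for autodiff.

```python
import math

def edp(tile, buf, work=4096.0):
    # Hypothetical analytical surrogate: larger tiles amortize fill
    # overhead but cost more per tile; larger buffers cut refetch
    # energy but add static cost. Not DOSA's real cost model.
    latency = work / tile + 4.0 * tile
    energy = 2.0 * buf + work / buf
    return latency * energy

def log_edp(a, b):
    # Optimize log(EDP) over log-parameters so gradients stay
    # well-scaled across orders of magnitude.
    return math.log(edp(math.exp(a), math.exp(b)))

def grad(f, x, y, eps=1e-5):
    # Central finite differences stand in for autodiff in this sketch.
    gx = (f(x + eps, y) - f(x - eps, y)) / (2 * eps)
    gy = (f(x, y + eps) - f(x, y - eps)) / (2 * eps)
    return gx, gy

def co_search(steps=500, lr=0.1):
    a, b = 0.0, 0.0  # tile = buf = 1 initially
    for _ in range(steps):
        ga, gb = grad(log_edp, a, b)
        a -= lr * ga  # mapping and hardware parameters are
        b -= lr * gb  # updated jointly, in one loop
    return math.exp(a), math.exp(b)

tile, buf = co_search()
print(tile, buf, edp(tile, buf))
```

In a one-loop search like this, both spaces move together each step, rather than fixing hardware and searching mappings (or vice versa) in alternating stages. A real system would replace the surrogate with differentiable analytical and learned models and round the continuous solution back to legal integer design points.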
Charles Hong
University of California, Berkeley
Qijing Huang
NVIDIA
Grace Dinh
University of California, Berkeley
Mahesh Subedar
Intel Labs
Machine Learning · Generative AI · Hardware Accelerators · Video Post Processing · SoC
Yakun Sophia Shao
Associate Professor, UC Berkeley
Computer Architecture · VLSI