ESM: A Framework for Building Effective Surrogate Models for Hardware-Aware Neural Architecture Search

📅 2025-08-02
📈 Citations: 0
Influential: 0
🤖 AI Summary
Hardware-aware neural architecture search (NAS) suffers from low GPU latency prediction accuracy and high data acquisition costs. Method: This paper proposes the first end-to-end surrogate model construction framework, systematically analyzing the entire surrogate modeling pipeline to identify data generation bias and model generalization bottlenecks. It jointly optimizes automated sampling strategies and the training of multiple regression models while incorporating hardware-aware feature enhancement. Results: Experiments on NVIDIA GPU platforms demonstrate that the proposed method reduces average latency prediction error by 38.2% and cuts data collection overhead by 52%. Moreover, it exhibits strong cross-architecture and cross-device generalization, significantly improving the search efficiency and reliability of NAS in resource-constrained scenarios.

📝 Abstract
Hardware-aware Neural Architecture Search (NAS) is one of the most promising techniques for designing efficient Deep Neural Networks (DNNs) for resource-constrained devices. Surrogate models play a crucial role in hardware-aware NAS as they enable efficient prediction of performance characteristics (e.g., inference latency and energy consumption) of different candidate models on the target hardware device. In this paper, we focus on building hardware-aware latency prediction models. We study different types of surrogate models and highlight their strengths and weaknesses. We perform a systematic analysis of the factors that influence the prediction accuracy of these models, assessing the importance of each stage of the model design process and identifying the methods and policies needed to design and train an effective estimation model, specifically for GPU-powered devices. Based on the insights gained from the analysis, we present a holistic framework that enables reliable dataset generation and efficient model generation, considering the overall costs of different stages of the model generation pipeline.
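The paper does not include code here, but the core idea of a latency surrogate can be sketched as a regression model that maps an architecture's feature encoding to its measured on-device latency. The sketch below is illustrative only, not the paper's method: the three-dimensional feature vector, the linear latency response, and the use of ridge regression are all assumptions made for the example; real surrogates would use richer encodings and latencies measured on the target GPU.

```python
import numpy as np

# Hypothetical architecture encoding: each candidate network is described by a
# small feature vector (e.g., depth, average channel width, a FLOPs estimate).
# Latencies would normally be measured on the target GPU; here a synthetic
# linear-plus-noise response stands in, purely for illustration.
rng = np.random.default_rng(0)
X = rng.uniform(1.0, 10.0, size=(200, 3))          # 200 sampled architectures
true_w = np.array([0.8, 0.3, 1.5])                 # unknown hardware response
y = X @ true_w + rng.normal(0.0, 0.05, size=200)   # "measured" latency (ms)

def fit_ridge(X, y, lam=1e-3):
    """Closed-form ridge regression: w = (X^T X + lam * I)^{-1} X^T y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

w = fit_ridge(X, y)

# Predict latency for an unseen candidate architecture instead of running it
# on hardware -- the role the surrogate plays inside the NAS search loop.
candidate = np.array([4.0, 2.0, 6.0])
predicted_latency = candidate @ w
```

In an actual hardware-aware NAS pipeline, the sampling strategy that picks which architectures to benchmark and the choice of regression model are exactly the stages the paper's framework optimizes jointly.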
Problem

Research questions and friction points this paper is trying to address.

Develop surrogate models for hardware-aware NAS
Analyze factors affecting latency prediction accuracy
Create framework for efficient GPU-aware model generation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hardware-aware latency prediction models for NAS
Systematic analysis of surrogate model accuracy
Holistic framework for dataset and model generation