🤖 AI Summary
Hardware-aware neural architecture search (NAS) suffers from low GPU latency prediction accuracy and high data acquisition costs. Method: This paper proposes the first end-to-end proxy model construction framework, systematically analyzing the proxy modeling pipeline to identify data generation bias and model generalization bottlenecks. It jointly optimizes automated sampling strategies and multi-regression model training while incorporating hardware-aware feature enhancement. Results: Experiments on NVIDIA GPU platforms show that the proposed method reduces average latency prediction error by 38.2% and cuts data collection overhead by 52%. It also exhibits strong cross-architecture and cross-device generalization, significantly improving the search efficiency and reliability of NAS in resource-constrained scenarios.
📝 Abstract
Hardware-aware Neural Architecture Search (NAS) is one of the most promising techniques for designing efficient Deep Neural Networks (DNNs) for resource-constrained devices. Surrogate models play a crucial role in hardware-aware NAS because they enable efficient prediction of performance characteristics (e.g., inference latency and energy consumption) of candidate models on the target hardware device. In this paper, we focus on building hardware-aware latency prediction models. We study different types of surrogate models and highlight their strengths and weaknesses. We then perform a systematic analysis of the factors that influence the prediction accuracy of these models, assessing the importance of each stage of the model design process and identifying the methods and policies needed to design and train an effective estimation model, specifically for GPU-powered devices. Based on the insights gained from this analysis, we present a holistic framework that enables reliable dataset generation and efficient model generation, accounting for the overall cost of each stage of the model generation pipeline.
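To make the surrogate-model idea concrete, here is a minimal sketch of a latency predictor: sample a few candidate architectures, measure their latency on the target device, fit a regression model on simple architecture descriptors, and query it for unseen candidates. This is an illustrative toy, not the paper's framework; the featurization (`depth`, `width`, `MFLOPs`), the sampled architectures, and the latency numbers are all hypothetical.

```python
import numpy as np

def featurize(arch):
    """Map an architecture descriptor (depth, avg. width, MFLOPs)
    to a feature vector with a bias term. Hypothetical featurization."""
    depth, width, mflops = arch
    return np.array([1.0, depth, width, mflops])

# Toy "measured" latencies (ms) for a few sampled architectures (made-up numbers;
# in practice these would come from profiling on the target GPU).
archs = [(10, 64, 120.0), (20, 128, 480.0), (14, 96, 260.0), (30, 256, 1900.0)]
lat_ms = np.array([3.1, 9.8, 5.6, 36.0])

# Fit a linear surrogate by least squares on the measured data.
X = np.stack([featurize(a) for a in archs])
w, *_ = np.linalg.lstsq(X, lat_ms, rcond=None)

def predict_latency(arch):
    """Predict latency (ms) for an unseen candidate without running it."""
    return float(featurize(arch) @ w)
```

During the search, `predict_latency` replaces costly on-device measurement when ranking candidates; real surrogates typically use richer features and nonlinear regressors, which is precisely the design space the paper analyzes.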