🤖 AI Summary
Hardware-aware neural architecture search (NAS) suffers from low GPU latency prediction accuracy and high data acquisition costs. Method: This paper proposes the first end-to-end proxy model construction framework, systematically analyzing the proxy modeling pipeline to identify data generation bias and model generalization bottlenecks. It jointly optimizes automated sampling strategies and multi-regression model training while incorporating hardware-aware feature enhancement. Results: Experiments on NVIDIA GPU platforms show that the proposed method reduces average latency prediction error by 38.2% and cuts data collection overhead by 52%. It also exhibits strong cross-architecture and cross-device generalization, significantly improving the search efficiency and reliability of NAS in resource-constrained scenarios.
📝 Abstract
Hardware-aware Neural Architecture Search (NAS) is one of the most promising techniques for designing efficient Deep Neural Networks (DNNs) for resource-constrained devices. Surrogate models play a crucial role in hardware-aware NAS because they enable efficient prediction of performance characteristics (e.g., inference latency and energy consumption) of candidate models on the target hardware device. In this paper, we focus on building hardware-aware latency prediction models. We study different types of surrogate models and highlight their strengths and weaknesses. We then perform a systematic analysis of the factors that influence the prediction accuracy of these models, assessing the importance of each stage of the model design process and identifying the methods and policies needed to design and train an effective estimation model, specifically for GPU-powered devices. Based on the insights gained from this analysis, we present a holistic framework that enables reliable dataset generation and efficient model generation, accounting for the overall cost of each stage of the model generation pipeline.
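To make the surrogate-model idea concrete, here is a minimal sketch of a latency predictor: sample a few candidate architectures, measure their latency on the target device, fit a regression model on simple architecture descriptors, and query it for unseen candidates. This is an illustrative toy, not the paper's framework; the featurization (`depth`, `width`, `MFLOPs`), the sampled architectures, and the latency numbers are all hypothetical.

```python
import numpy as np

def featurize(arch):
    """Map an architecture descriptor (depth, avg. width, MFLOPs)
    to a feature vector with a bias term. Hypothetical featurization."""
    depth, width, mflops = arch
    return np.array([1.0, depth, width, mflops])

# Toy "measured" latencies (ms) for a few sampled architectures (made-up numbers;
# in practice these would come from profiling on the target GPU).
archs = [(10, 64, 120.0), (20, 128, 480.0), (14, 96, 260.0), (30, 256, 1900.0)]
lat_ms = np.array([3.1, 9.8, 5.6, 36.0])

# Fit a linear surrogate by least squares on the measured data.
X = np.stack([featurize(a) for a in archs])
w, *_ = np.linalg.lstsq(X, lat_ms, rcond=None)

def predict_latency(arch):
    """Predict latency (ms) for an unseen candidate without running it."""
    return float(featurize(arch) @ w)
```

During the search, `predict_latency` replaces costly on-device measurement when ranking candidates; real surrogates typically use richer features and nonlinear regressors, which is precisely the design space the paper analyzes.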