Leveraging Neural Graph Compilers in Machine Learning Research for Edge-Cloud Systems

📅 2025-04-28

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Problem: Neural graph compilers distort hardware performance evaluation in heterogeneous edge-cloud systems, so architectural decisions made without accounting for compiler effects can be wrong. Method: We propose a compilation-aware ML research paradigm that combines fine-grained cross-platform benchmarking (CPU, GPU, TPU, NPU), block-level performance attribution, and batch-size sensitivity modeling. Contribution/Results: Our study shows that compilers can invert relative hardware rankings, amplify performance gains for simple model structures, and make results strongly dependent on model topology and batch size. We introduce a “compilation friction coefficient” to quantify batch-induced performance degradation, incorporate compilation impact into end-to-end ML system design, and provide reproducible, deployable co-optimization guidelines. Experimental results show that vendor-supplied compilers can completely reverse hardware performance rankings, so neglecting compilation effects risks selecting the wrong target architecture.

📝 Abstract

This work presents a comprehensive evaluation of neural network graph compilers across heterogeneous hardware platforms, addressing the critical gap between theoretical optimization techniques and practical deployment scenarios. We demonstrate how vendor-specific optimizations can invalidate relative performance comparisons between architectural archetypes, with performance advantages sometimes completely reversing after compilation. Our systematic analysis reveals that graph compilers exhibit performance patterns highly dependent on both neural architecture and batch sizes. Through fine-grained block-level experimentation, we establish that vendor-specific compilers can leverage repeated patterns in simple architectures, yielding disproportionate throughput gains as model depth increases. We introduce novel metrics to quantify a compiler's ability to mitigate performance friction as batch size increases. Our methodology bridges the gap between academic research and practical deployment by incorporating compiler effects throughout the research process, providing actionable insights for practitioners navigating complex optimization landscapes across heterogeneous hardware environments.
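The ranking reversal described in the abstract can be checked mechanically once latencies are measured with and without compilation. The following is a minimal sketch, not the paper's tooling: the helper names `rank` and `reversed_pairs` are hypothetical, and the latency figures are invented purely for illustration.

```python
from typing import Dict, List, Tuple


def rank(latency_ms: Dict[str, float]) -> List[str]:
    """Return hardware names ordered fastest first (lowest latency)."""
    return sorted(latency_ms, key=latency_ms.get)


def reversed_pairs(before: Dict[str, float],
                   after: Dict[str, float]) -> List[Tuple[str, str]]:
    """List hardware pairs whose relative order flips after compilation."""
    names = sorted(before)
    flips = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            # A sign change in the latency difference means the pair swapped places.
            if (before[a] - before[b]) * (after[a] - after[b]) < 0:
                flips.append((a, b))
    return flips


# Hypothetical latencies in milliseconds (assumed values, not from the paper).
eager = {"cpu": 40.0, "gpu": 12.0, "npu": 18.0}
compiled = {"cpu": 30.0, "gpu": 10.0, "npu": 6.0}

print(rank(eager))                      # ['gpu', 'npu', 'cpu']
print(rank(compiled))                   # ['npu', 'gpu', 'cpu']
print(reversed_pairs(eager, compiled))  # [('gpu', 'npu')]
```

In this made-up example the GPU looks fastest before compilation and the NPU looks fastest afterwards, which is the kind of reversal that would mislead a hardware choice based on uncompiled measurements alone.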

Problem

Research questions and friction points this paper is trying to address.

Evaluating neural network graph compilers on heterogeneous hardware platforms

Analyzing performance impacts of vendor-specific optimizations on neural architectures

Quantifying compiler ability to reduce performance friction with batch size

Innovation

Methods, ideas, or system contributions that make the work stand out.

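Incorporating neural graph compiler effects throughout ML research for edge-cloud systems

Vendor-specific optimizations can invalidate, or even reverse, hardware performance comparisons

Novel metrics quantify how well a compiler mitigates performance friction as batch size increases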
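This page does not give the formula behind the batch-size metric, so the sketch below uses a stand-in definition chosen only to illustrate the idea. It assumes friction at batch size b is one minus the throughput scaling efficiency relative to the smallest measured batch. The function names and latency values are hypothetical and should not be read as the paper's definition.

```python
from typing import Dict


def throughput(batch_size: int, latency_s: float) -> float:
    """Samples processed per second for one batch."""
    return batch_size / latency_s


def friction(latency_by_batch: Dict[int, float]) -> Dict[int, float]:
    """Assumed metric: 1 - (observed throughput / ideal linear throughput).

    The ideal scales the smallest batch's throughput linearly with batch
    size, so 0.0 means perfect scaling and values near 1.0 mean that
    larger batches bring almost no throughput benefit.
    """
    base = min(latency_by_batch)
    base_tp = throughput(base, latency_by_batch[base])
    result = {}
    for b in sorted(latency_by_batch):
        ideal_tp = base_tp * (b / base)
        result[b] = 1.0 - throughput(b, latency_by_batch[b]) / ideal_tp
    return result


# Hypothetical per-batch latencies in seconds (assumed values).
eager = {1: 0.010, 8: 0.040, 32: 0.160}
compiled = {1: 0.008, 8: 0.016, 32: 0.040}

eager_friction = friction(eager)
compiled_friction = friction(compiled)

# Positive values mean the compiler reduced friction at that batch size.
reduction = {b: eager_friction[b] - compiled_friction[b] for b in eager}
print(reduction)
```

Under this assumed definition, comparing the two friction curves per batch size gives one number per configuration that can be tracked across model architectures and hardware targets.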
