🤖 AI Summary
Existing AI benchmarks lack systematic modeling of vehicle-specific constraints, such as functional safety and real-time latency, and no standardized, publicly available evaluation framework exists for automotive ML systems. Method: This paper introduces MLPerf Automotive, the first open, standardized benchmark for ML acceleration in autonomous driving perception tasks (2D object detection, 2D semantic segmentation, and 3D object detection). It integrates functional safety requirements and end-to-end latency constraints into its evaluation methodology, defining a unified model submission specification, a joint accuracy–latency assessment protocol, and cross-platform comparable metrics. Contribution/Results: Developed by MLCommons in partnership with the Autonomous Vehicle Computing Consortium, the benchmark provides open-source reference implementations and first-round evaluation results, filling a critical gap in standardized benchmarking for automotive ML acceleration. It enhances transparency and reproducibility in hardware–software co-design for autonomous vehicles, enabling rigorous, constraint-aware evaluation of accelerators under realistic automotive conditions.
📝 Abstract
We present MLPerf Automotive, the first standardized public benchmark for evaluating machine learning systems deployed for AI acceleration in automotive applications. Developed through a collaborative partnership between MLCommons and the Autonomous Vehicle Computing Consortium, this benchmark addresses the need for standardized performance evaluation methodologies for automotive machine learning systems. Existing benchmark suites cannot be applied to these systems because automotive workloads carry unique constraints, including safety and real-time processing, that distinguish them from the domains targeted by prior benchmarks. Our benchmarking framework provides latency and accuracy metrics along with evaluation protocols that enable consistent and reproducible performance comparisons across different hardware platforms and software implementations. The first iteration of the benchmark consists of automotive perception tasks in 2D object detection, 2D semantic segmentation, and 3D object detection. We describe the methodology behind the benchmark design, including task selection, reference models, and submission rules. We also discuss the first round of benchmark submissions, the challenges involved in acquiring the datasets, and the engineering effort behind the reference implementations. Our benchmark code is available at https://github.com/mlcommons/mlperf_automotive.