🤖 AI Summary
The scarcity of large-scale, standardized engine audio datasets with precise operating-condition annotations has hindered progress in active sound design and data-driven synthesis. This work proposes an analysis-driven procedural generation framework that extracts harmonic structures from real-world recordings via pitch-adaptive spectral analysis and uses them to drive an extended parametric harmonic-plus-noise synthesizer, enabling sample-accurate control over RPM and torque. The resulting Procedural Engine Sounds Dataset comprises 19 hours of audio across 5,935 files, spanning a wide range of operating conditions and acoustic complexities. The dataset addresses the gap in real-world data availability while preserving authentic harmonic characteristics, thereby supporting learning-based parameter estimation and audio synthesis tasks.
📝 Abstract
Computational engine sound modeling is central to the automotive audio industry, particularly for active sound design, virtual prototyping, and emerging data-driven engine sound synthesis methods. These applications require large volumes of standardized, clean audio recordings with precisely time-aligned operating-state annotations. Such data is difficult to obtain because of high costs, specialized measurement equipment requirements, and inevitable noise contamination. We present an analysis-driven framework for generating engine audio with sample-accurate control annotations. The method extracts harmonic structures from real recordings through pitch-adaptive spectral analysis; these structures then drive an extended parametric harmonic-plus-noise synthesizer. With this framework, we generate the Procedural Engine Sounds Dataset (19 hours, 5,935 files), a set of engine audio signals with sample-accurate RPM and torque annotations, spanning a wide range of operating conditions, signal complexities, and harmonic profiles. Comparison against real recordings validates that the synthesized data preserves characteristic harmonic structures, and baseline experiments confirm its suitability for learning-based parameter estimation and synthesis tasks. The dataset is released publicly to support research on engine timbre analysis, control parameter estimation, acoustic modeling, and neural generative networks.
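To make the idea of a harmonic-plus-noise synthesizer with sample-accurate control concrete, the following is a minimal Python sketch. It is not the authors' extended synthesizer. The function name `synthesize_engine`, the harmonic amplitudes, the noise level, and the mapping from torque to gain are all hypothetical choices made for illustration. The sketch assumes that harmonics sit at integer multiples of the crankshaft rotation frequency (RPM / 60) and that torque is normalized to the range [0, 1].

```python
import numpy as np


def synthesize_engine(rpm, torque, sample_rate=16000,
                      harmonic_amps=(1.0, 0.5, 0.25),
                      noise_level=0.05, seed=0):
    """Toy harmonic-plus-noise synthesis from per-sample RPM and torque.

    All defaults are illustrative assumptions, not values from the paper.
    """
    rpm = np.asarray(rpm, dtype=float)
    torque = np.asarray(torque, dtype=float)
    if rpm.shape != torque.shape:
        raise ValueError("rpm and torque need one value per audio sample")

    # Crankshaft rotation frequency in Hz for every sample.
    f0 = rpm / 60.0
    # Integrate frequency into phase so RPM changes stay click-free.
    phase = 2.0 * np.pi * np.cumsum(f0) / sample_rate

    harmonic = np.zeros_like(rpm)
    for k, amp in enumerate(harmonic_amps, start=1):
        # Mute a harmonic wherever it would exceed the Nyquist frequency.
        audible = (k * f0) < (sample_rate / 2.0)
        harmonic += amp * audible * np.sin(k * phase)

    # Noise part: plain white noise (assumed; a real model would shape it).
    rng = np.random.default_rng(seed)
    noise = noise_level * rng.standard_normal(rpm.shape)

    # Hypothetical torque mapping: louder output under higher load.
    gain = 0.5 + 0.5 * np.clip(torque, 0.0, 1.0)
    return gain * (harmonic + noise)


# One second of audio ramping from 1000 to 3000 RPM at rising load.
n = 16000
rpm_curve = np.linspace(1000.0, 3000.0, n)
torque_curve = np.linspace(0.2, 0.8, n)
audio = synthesize_engine(rpm_curve, torque_curve)
```

Because the control curves have one value per audio sample, the RPM and torque arrays passed in are themselves the sample-accurate annotations for the generated signal, which is the property the dataset relies on.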