🤖 AI Summary
Autonomous driving safety verification faces two key bottlenecks: the oracle problem and insufficient scenario coverage. To address these, this paper proposes an enhanced metamorphic testing (MT) framework integrated with generative AI—specifically, the first incorporation of Stable Diffusion into MT—to enable semantically preserved, fine-grained scene mutations (e.g., weather, road markings, environmental elements) within the operational design domain (ODD). Unlike conventional MT relying on simple syntactic transformations, our approach automatically generates thousands of semantically consistent, high-fidelity, reproducible, and generalizable test cases. Experimental results demonstrate substantial improvements in edge-case coverage and robustness assessment capability. The framework establishes a scalable, interpretable, and principled paradigm for safety validation of autonomous driving systems.
📝 Abstract
Self-driving cars have the potential to revolutionize transportation, but ensuring their safety remains a significant challenge. These systems must navigate a variety of unexpected scenarios on the road, and their complexity poses substantial difficulties for thorough testing. Conventional testing methodologies face critical limitations, including the oracle problem (determining whether the system's behavior is correct) and the inability to exhaustively recreate the range of situations a self-driving car may encounter. While Metamorphic Testing (MT) offers a partial solution to these challenges, its application is often limited by simplistic modifications to test scenarios. In this position paper, we propose enhancing MT by integrating AI-driven image generation tools, such as Stable Diffusion, to improve testing methodologies. These tools can generate nuanced variations of driving scenarios within the operational design domain (ODD), for example by altering weather conditions, modifying environmental elements, or adjusting lane markings, while preserving the critical features necessary for system evaluation. This approach enables reproducible testing, efficient reuse of test criteria, and comprehensive evaluation of a self-driving system's performance across diverse scenarios, thereby addressing key gaps in current testing practices.
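As an illustration of how MT sidesteps the oracle problem, the following minimal sketch checks one metamorphic relation: a scene mutation that stays within the ODD should not change the predicted steering output by more than a tolerance. Everything here is an assumption made for illustration and is not taken from the paper: the function names, the toy brightness-based model, the tolerance value, and the two hand-written mutations, which merely stand in for variations that an image generation tool such as Stable Diffusion might produce.

```python
from typing import Callable, List, Sequence, Tuple

Image = List[List[float]]  # toy grayscale image: rows of pixel intensities


def toy_steering_model(image: Image) -> float:
    """Hypothetical system under test: steers toward the brighter half."""
    half = len(image[0]) // 2
    left = sum(sum(row[:half]) for row in image)
    right = sum(sum(row[half:]) for row in image)
    total = left + right
    return 0.0 if total == 0 else (right - left) / total


def dim_scene(image: Image) -> Image:
    """Stand-in mutation: uniformly darker scene (e.g. dusk)."""
    return [[pixel * 0.5 for pixel in row] for row in image]


def add_fog(image: Image) -> Image:
    """Stand-in mutation: constant haze added to every pixel."""
    return [[pixel + 0.5 for pixel in row] for row in image]


def check_metamorphic_relation(
    model: Callable[[Image], float],
    source: Image,
    mutations: Sequence[Tuple[str, Callable[[Image], Image]]],
    tolerance: float,
) -> List[Tuple[str, float, float]]:
    """Return (mutation name, source output, follow-up output) for each violation.

    No ground-truth steering value is needed: the source output itself
    serves as the reference for every follow-up test case.
    """
    baseline = model(source)
    violations = []
    for name, mutate in mutations:
        follow_up_output = model(mutate(source))
        if abs(follow_up_output - baseline) > tolerance:
            violations.append((name, baseline, follow_up_output))
    return violations
```

Calling `check_metamorphic_relation(toy_steering_model, scene, [("dim", dim_scene), ("fog", add_fog)], tolerance=0.05)` on a small scene returns only the mutations whose follow-up output drifts beyond the tolerance. In a real setup, the mutation callables would wrap an image generator and the model would be the driving system under test.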