Curiosity-Driven Testing for Sequential Decision-Making Process

📅 2025-09-02
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the susceptibility of safety-critical sequential decision-making systems to learning unsafe behaviors, this paper proposes a curiosity-driven black-box fuzzing framework for efficiently discovering diverse crash-triggering scenarios. Methodologically, it innovatively integrates an intrinsic curiosity mechanism with a multi-objective seed selection strategy to dynamically balance exploration of novel states and fault triggering. Technically, it incorporates deep learning–based mutation generation, prediction-uncertainty–guided novelty measurement, and Pareto-optimal seed scheduling. Experimental evaluation across multiple mainstream sequential decision models demonstrates that the approach significantly outperforms existing state-of-the-art methods: it achieves a 23.6% improvement in fault detection rate and a 41.2% increase in crash-scenario diversity. Moreover, the generated diverse failure cases facilitate subsequent model repair and robustness enhancement.
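The prediction-uncertainty–guided novelty measurement mentioned above can be sketched as follows. This is a minimal illustration, not the paper's implementation: novelty is taken as the disagreement (population variance) among an ensemble of hypothetical dynamics predictors, so states on which the ensemble diverges score as more novel and attract the fuzzer's curiosity.

```python
import statistics
from typing import Callable, List, Sequence

def novelty_score(state: Sequence[float],
                  predictors: List[Callable[[Sequence[float]], float]]) -> float:
    """Curiosity signal: variance of the ensemble's predictions for a state.

    High disagreement means the predictors have not learned this region of
    the state space, so the state is treated as novel.
    """
    predictions = [p(state) for p in predictors]
    return statistics.pvariance(predictions)

# Toy ensemble (hypothetical): three "models" that agree on familiar
# states and diverge as the first state feature grows.
ensemble = [
    lambda s: sum(s),
    lambda s: sum(s) + 0.1 * s[0],
    lambda s: sum(s) - 0.1 * s[0],
]

familiar = [0.0, 1.0]    # ensemble agrees -> low novelty
unfamiliar = [5.0, 1.0]  # ensemble diverges -> high novelty
assert novelty_score(familiar, ensemble) < novelty_score(unfamiliar, ensemble)
```

In a full fuzzer this score would feed into seed prioritization, steering mutation toward states the decision model has rarely encountered.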

📝 Abstract
Sequential decision-making processes (SDPs) are fundamental for complex real-world challenges, such as autonomous driving, robotic control, and traffic management. While recent advances in Deep Learning (DL) have led to mature solutions for solving these complex problems, SDPs remain vulnerable to learning unsafe behaviors, posing significant risks in safety-critical applications. However, developing a testing framework for SDPs that can identify a diverse set of crash-triggering scenarios remains an open challenge. To address this, we propose CureFuzz, a novel curiosity-driven black-box fuzz testing approach for SDPs. CureFuzz proposes a curiosity mechanism that allows a fuzzer to effectively explore novel and diverse scenarios, leading to improved detection of crash-triggering scenarios. Additionally, we introduce a multi-objective seed selection technique to balance the exploration of novel scenarios and the generation of crash-triggering scenarios, thereby optimizing the fuzzing process. We evaluate CureFuzz on various SDPs, and experimental results demonstrate that CureFuzz outperforms the state-of-the-art method by a substantial margin in both the total number of faults and the distinct types of crash-triggering scenarios. We also demonstrate that the crash-triggering scenarios found by CureFuzz can be used to repair SDPs, highlighting CureFuzz as a valuable tool for testing SDPs and optimizing their performance.
Problem

Research questions and friction points this paper is trying to address.

Testing sequential decision-making processes for unsafe behaviors
Identifying diverse crash-triggering scenarios in safety-critical applications
Developing effective fuzz testing framework for deep learning systems
Innovation

Methods, ideas, or system contributions that make the work stand out.

Curiosity-driven black-box fuzz testing approach
Multi-objective seed selection technique balancing exploration
Novel curiosity mechanism exploring diverse crash scenarios
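The multi-objective seed selection listed above can be illustrated with a minimal Pareto-front filter over two assumed objectives: a curiosity-based novelty score and a crash-triggering score. The seed names and scores here are hypothetical, and the sketch only shows the non-dominance criterion, not the paper's full scheduling strategy:

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Seed:
    """A fuzzing seed scored on two objectives (hypothetical values)."""
    scenario_id: str
    novelty: float      # curiosity signal for the scenario
    crash_score: float  # observed propensity to trigger crashes

def pareto_front(seeds: List[Seed]) -> List[Seed]:
    """Keep seeds not dominated on (novelty, crash_score).

    A seed is dominated if some other seed is at least as good on both
    objectives and strictly better on at least one.
    """
    front = []
    for s in seeds:
        dominated = any(
            (o.novelty >= s.novelty and o.crash_score >= s.crash_score)
            and (o.novelty > s.novelty or o.crash_score > s.crash_score)
            for o in seeds
        )
        if not dominated:
            front.append(s)
    return front

seeds = [
    Seed("a", novelty=0.9, crash_score=0.1),  # highly novel
    Seed("b", novelty=0.2, crash_score=0.8),  # crash-prone
    Seed("c", novelty=0.5, crash_score=0.5),  # balanced
    Seed("d", novelty=0.4, crash_score=0.4),  # dominated by "c"
]
front = pareto_front(seeds)
print(sorted(s.scenario_id for s in front))  # → ['a', 'b', 'c']
```

Selecting from the Pareto front rather than ranking on a single weighted score is what lets a fuzzer keep both exploration-heavy and fault-heavy seeds in play at once.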
Junda He
Singapore Management University
Software Engineering
Zhou Yang
Singapore Management University, Singapore, Singapore
Jieke Shi
PhD Candidate & Research Engineer, Singapore Management University
Software Engineering, AI Software Testing
Chengran Yang
Singapore Management University, Singapore, Singapore
Kisub Kim
Assistant Professor @ DGIST, Korea
AI for Software Engineering, Large Language Models, Software Analytics, Manufacturing AI
Bowen Xu
North Carolina State University, Raleigh, United States
Xin Zhou
Singapore Management University, Singapore, Singapore
David Lo
Singapore Management University, Singapore, Singapore