PyResBugs: A Dataset of Residual Python Bugs for Natural Language-Driven Fault Injection

2025-05-09
AI Summary
This work addresses the critical lack of real-world, production-grade defect data for AI-driven automated testing. To this end, we introduce PyResBugs, the first natural-language-annotated dataset targeting residual bugs in Python: those that evade conventional testing and manifest only in production environments. Methodologically, we systematically collect defect pairs (faulty/patched versions) from mainstream Python frameworks, ensuring rigorous human annotation, version alignment, and multi-level validation. Each defect is accompanied by fine-grained natural language descriptions covering root cause, triggering conditions, and observable exception behavior. Our core contribution is the first precise mapping from natural language specifications to executable faults, thereby bridging the long-standing gap between NL-driven fault injection and production-representative defects. Empirical evaluation demonstrates that PyResBugs significantly enhances the generalizability and practical utility of AI-based testing tools in both real-defect detection and controllable fault injection tasks.

Abstract
This paper presents PyResBugs, a curated dataset of residual bugs, i.e., defects that persist undetected during traditional testing but later surface in production, collected from major Python frameworks. Each bug in the dataset is paired with its corresponding fault-free (fixed) version and annotated with multi-level natural language (NL) descriptions. These NL descriptions enable natural language-driven fault injection, offering a novel approach to simulating real-world faults in software systems. By bridging the gap between software fault injection techniques and real-world representativeness, PyResBugs provides researchers with a high-quality resource for advancing AI-driven automated testing in Python systems.
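To make the structure described above concrete, here is a minimal sketch of how one entry might be represented: a faulty version, its fixed counterpart, and natural language descriptions at several levels of detail. The class name, field names, level names, and the toy bug are all assumptions for illustration; they are not the dataset's actual schema or contents.

```python
import difflib
from dataclasses import dataclass, field


@dataclass
class ResidualBugRecord:
    """Hypothetical record layout; field names are assumptions, not the real schema."""
    project: str
    faulty_code: str   # version containing the residual bug
    fixed_code: str    # corresponding fault-free version
    # NL descriptions keyed by an assumed level name, e.g. "summary" or "detailed"
    descriptions: dict[str, str] = field(default_factory=dict)

    def fix_diff(self) -> str:
        """Return a unified diff from the faulty version to the fixed version."""
        return "".join(
            difflib.unified_diff(
                self.faulty_code.splitlines(keepends=True),
                self.fixed_code.splitlines(keepends=True),
                fromfile="faulty.py",
                tofile="fixed.py",
            )
        )


# Toy example invented for illustration; not taken from the dataset.
record = ResidualBugRecord(
    project="example_framework",
    faulty_code="def mean(xs):\n    return sum(xs) / len(xs)\n",
    fixed_code=(
        "def mean(xs):\n"
        "    if not xs:\n"
        "        return 0.0\n"
        "    return sum(xs) / len(xs)\n"
    ),
    descriptions={
        "summary": "Missing empty-input check in mean().",
        "detailed": "mean() divides by len(xs) without guarding against an "
                    "empty list, raising ZeroDivisionError.",
    },
)

print(record.fix_diff())
```

Keeping both versions in the record means the patch can be derived on demand rather than stored separately, which is one plausible design among several.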
Problem

Research questions and friction points this paper is trying to address.

Dataset of residual Python bugs undetected in testing
Enables natural language-driven fault injection
Advances AI-driven automated testing in Python
Innovation

Methods, ideas, or system contributions that make the work stand out.

Curated dataset of residual Python bugs
Multi-level natural language descriptions
Natural language-driven fault injection
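As a rough illustration of natural language-driven fault injection, the sketch below assembles a text prompt that asks a code-generation model to introduce a described fault into fault-free code. The function name, the level names, and the prompt wording are assumptions made for this example; the source does not specify how descriptions are fed to a model.

```python
def build_injection_prompt(fixed_code: str, descriptions: dict, level: str = "summary") -> str:
    """Build a fault-injection prompt from fault-free code and an NL description.

    `level` selects one of the multi-level descriptions; the level names used
    here are hypothetical.
    """
    if level not in descriptions:
        raise KeyError(f"no description at level {level!r}")
    return (
        "Inject the fault described below into the following Python code.\n"
        f"Fault description ({level}): {descriptions[level]}\n"
        "Code:\n"
        f"{fixed_code}"
    )


# Toy inputs invented for illustration.
fixed = "def mean(xs):\n    if not xs:\n        return 0.0\n    return sum(xs) / len(xs)\n"
notes = {"summary": "Remove the empty-input check so an empty list raises an error."}

prompt = build_injection_prompt(fixed, notes)
print(prompt)
```

A model's output for such a prompt could then be compared against the known faulty version of the same pair, which is what having both versions in the dataset would allow.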