🤖 AI Summary
Problem: AI reliability research is hindered by the scarcity of high-quality, standardized empirical data, which impedes rigorous academic investigation. Method: We introduce DR-AIR, the first open-source, comprehensive data repository dedicated to AI reliability. It features a novel taxonomy for AI reliability data and systematically integrates 23 high-quality datasets spanning computer vision, natural language processing, and reinforcement learning. These datasets cover failure logs, robustness testing benchmarks, and multi-task/multi-model evaluation scenarios. DR-AIR employs standardized data modeling, rich metadata annotation, heterogeneous data fusion, and Git-versioned web APIs to ensure reproducibility and unified access. Contribution/Results: DR-AIR has been adopted in AI reliability curricula and research at five universities, advancing community-driven, reproducible work in failure mode analysis, trustworthy evaluation, and robustness modeling. It establishes a foundational infrastructure for scalable, empirically grounded AI reliability science.
📝 Abstract
Artificial intelligence (AI) technologies and systems are advancing rapidly, and ensuring their reliability is crucial for fostering public confidence in their use. This requires modeling and analyzing reliability data specific to AI systems. A major challenge in AI reliability research, particularly for academic researchers, is the lack of readily available AI reliability data. To address this gap, this paper conducts a comprehensive review of available AI reliability data and establishes DR-AIR, a data repository for AI reliability. Specifically, we introduce key measurements and data types for assessing AI reliability, along with the methodologies used to collect these data. We also provide a detailed description of the currently available datasets with illustrative examples. Furthermore, we outline the setup of the DR-AIR repository and demonstrate its practical applications. The repository provides easy access to datasets specifically curated for AI reliability research. We believe these efforts will benefit the AI research community by facilitating access to valuable reliability data and promoting collaboration across academic domains within AI. We conclude with a call to action, encouraging the research community to contribute and share AI reliability data to further advance this critical field of study.