🤖 AI Summary
Despite the growing adoption of AI-driven test automation tools, their real-world efficacy, particularly in improving test efficiency, reducing maintenance costs, and enhancing defect detection, remains inadequately evaluated. Method: We conduct a systematic literature review identifying 55 tools and propose the first taxonomy of AI testing capabilities; further, we perform a dual-tool, dual-system empirical study on open-source projects, evaluating core functionalities including UI self-healing, visual testing, and intelligent test case generation. Contribution/Results: AI tools improve execution efficiency and reduce maintenance effort by over 30%, yet suffer from high false-positive rates, insufficient domain-knowledge integration, and strong model dependency. This work establishes the first benchmarking framework for AI-based testing that jointly integrates a comprehensive capability taxonomy with multi-dimensional empirical validation, providing foundational guidance for developing robust, interpretable, and production-ready AI testing tools.
📝 Abstract
Context: The rise of Artificial Intelligence (AI) in software engineering has led to the development of AI-powered test automation tools, promising improved efficiency, reduced maintenance effort, and enhanced defect detection. However, a systematic evaluation of these tools is needed to understand their capabilities, benefits, and limitations. Objective: This study has two objectives: (1) a systematic review of AI-assisted test automation tools, categorizing their key AI features; and (2) an empirical study of two selected AI-powered tools on two systems under test, to investigate the effectiveness and limitations of the tools. Method: A systematic review of 55 AI-based test automation tools was conducted, classifying them based on their AI-assisted capabilities, such as self-healing tests, visual testing, and AI-powered test generation. In the second phase, two representative tools were selected for the empirical study, in which we applied them to test two open-source software systems. Their performance was compared with traditional test automation approaches to evaluate efficiency and adaptability. Results: The review provides a comprehensive taxonomy of AI-driven testing tools, highlighting common features and trends. The empirical evaluation demonstrates that AI-powered automation enhances test execution efficiency and reduces maintenance effort, but it also exposes limitations such as difficulty handling complex UI changes and limited contextual understanding. Conclusion: AI-driven test automation tools show strong potential for improving software quality and reducing manual testing effort. However, their current limitations, such as false positives, lack of domain knowledge, and dependency on predefined models, indicate the need for further refinement. Future research should focus on advancing AI models to improve adaptability, reliability, and robustness in software testing.