🤖 AI Summary
Mobile application acceptance testing remains hindered by the high cost of creating and maintaining test artifacts, especially for cross-platform frameworks such as Flutter. This paper introduces AToMIC, a framework that applies specialized large language models (LLMs) to the end-to-end automated generation of industrial-grade mobile acceptance tests. Given JIRA requirements and recent code changes, AToMIC generates Gherkin scenarios, Page Object classes, and executable Flutter UI test scripts. It combines an understanding of the requirement text, awareness of code diffs, and domain-specific syntactic constraints to produce maintainable test artifacts. Evaluated on 13 real-world issues from BMW's MyBMW app, AToMIC produced executable test artifacts in under five minutes per feature: 93.3% of Gherkin scenarios were syntactically correct upon generation, 78.8% of Page Objects ran without manual edits, and 100% of generated UI tests executed successfully. These results indicate substantial gains in test development efficiency.
📝 Abstract
Mobile acceptance testing remains a bottleneck in modern software development, particularly for cross-platform development with frameworks like Flutter. While developers increasingly rely on automated testing tools, creating and maintaining acceptance test artifacts still demands significant manual effort. To help tackle this issue, we introduce AToMIC, an automated framework leveraging specialized Large Language Models to generate Gherkin scenarios, Page Objects, and executable UI test scripts directly from requirements (JIRA tickets) and recent code changes. Applied to BMW's MyBMW app, covering 13 real-world issues in a 170+ screen codebase, AToMIC produced executable test artifacts in under five minutes per feature on standard hardware. The generated artifacts were of high quality: 93.3% of Gherkin scenarios were syntactically correct upon generation, 78.8% of Page Objects ran without manual edits, and 100% of generated UI tests executed successfully. In a survey, all practitioners reported time savings (often a full developer-day per feature) and strong confidence in adopting the approach. These results confirm AToMIC as a scalable, practical solution for streamlining acceptance test creation and maintenance in industrial mobile projects.
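
To make the "syntactically correct Gherkin" metric more concrete, the following is a minimal sketch of what a structural check on a generated scenario could look like. It is not the validator used in the paper, which the abstract does not describe. The function name `check_gherkin`, the sample feature text, and the rules themselves are illustrative assumptions, and the check covers only a small English subset of Gherkin (`Feature`, `Scenario`, and `Given`/`When`/`Then`/`And`/`But` steps), ignoring constructs such as `Background`, `Examples`, and doc strings.

```python
import re

# Assumption: only English keywords and a small subset of Gherkin are covered.
PRIMARY = ("Given", "When", "Then")
CONTINUATION = ("And", "But")
SCENARIO_RE = re.compile(r"^(Scenario Outline|Scenario|Example):")


def check_gherkin(text):
    """Return a list of structural problems; an empty list means none were found."""
    problems = []
    seen_feature = False
    in_scenario = False
    step_count = 0  # steps seen so far in the current scenario

    for line_no, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        # Skip blank lines, comments, tags, and data-table rows.
        if not line or line[0] in "#@|":
            continue
        if line.startswith("Feature:"):
            if seen_feature:
                problems.append(f"line {line_no}: more than one Feature")
            seen_feature = True
        elif SCENARIO_RE.match(line):
            if not seen_feature:
                problems.append(f"line {line_no}: Scenario before Feature")
            if in_scenario and step_count == 0:
                problems.append(f"line {line_no}: previous Scenario has no steps")
            in_scenario, step_count = True, 0
        elif in_scenario:
            keyword = line.split(" ", 1)[0]
            if keyword in PRIMARY:
                step_count += 1
            elif keyword in CONTINUATION:
                if step_count == 0:
                    problems.append(
                        f"line {line_no}: '{keyword}' cannot start a scenario"
                    )
                step_count += 1
            else:
                problems.append(f"line {line_no}: unexpected text in scenario")
        # Text between Feature and the first Scenario is treated as description.

    if not seen_feature:
        problems.append("no Feature declared")
    if in_scenario and step_count == 0:
        problems.append("last Scenario has no steps")
    return problems


# Hypothetical scenario text, invented for illustration only.
SAMPLE = """\
Feature: Sign in
  Scenario: Successful sign in
    Given the sign-in screen is shown
    When the user submits valid credentials
    Then the home screen is shown
"""

print(check_gherkin(SAMPLE))  # prints []
```

In a pipeline of the kind the abstract describes, a check like this could gate generated scenarios before Page Objects and UI tests are produced from them; a production setup would more plausibly rely on a full Gherkin parser than on hand-written rules.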