Advancing Mobile UI Testing by Learning Screen Usage Semantics

📅 2025-05-15
📈 Citations: 0
Influential: 0
🤖 AI Summary
Current mobile UI automation testing faces two critical bottlenecks: (1) AI-guided (AIG) tools struggle to navigate complex GUIs—such as login screens and ad splash pages—and (2) they lack semantic understanding of high-level user tasks (e.g., logging in, setting alarms), resulting in incomplete functional coverage and exacerbating accessibility barriers for elderly users. To address these issues, this paper introduces screen usage semantics into the AIG framework for the first time. Our approach integrates multimodal representation learning, models interface navigation logic via graph neural networks, and proposes a semantic-clustering–based method for functional coverage assessment—enabling a shift from widget-level interaction to user-task–level comprehension. Evaluated on 32 mainstream Android apps, our method improves test coverage by 37%, reduces dead-end scenarios (e.g., login pages) by 82%, and automatically identifies 127 UI design flaws adversely affecting elderly users’ accessibility.
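The semantic-clustering coverage assessment mentioned above can be sketched in a minimal form: group semantically similar screen embeddings into clusters (each cluster approximating one high-level functionality) and report the fraction of clusters an AIG tool's trace actually reached. Everything below — the greedy cosine-threshold clustering, the function names, and the toy embeddings — is an illustrative assumption, not the paper's actual implementation:

```python
import math

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def cluster_screens(embeddings, threshold=0.9):
    """Greedy clustering: a screen joins the first cluster whose
    representative embedding is similar enough, else it starts a
    new cluster (a stand-in for whatever clustering the paper uses)."""
    clusters = []  # each: {"centroid": first member's embedding, "members": [...]}
    for i, emb in enumerate(embeddings):
        for cluster in clusters:
            if cosine(cluster["centroid"], emb) >= threshold:
                cluster["members"].append(i)
                break
        else:
            clusters.append({"centroid": emb, "members": [i]})
    return clusters

def functional_coverage(all_embeddings, visited_indices, threshold=0.9):
    """Fraction of screen clusters (proxy for high-level functionalities)
    that the test trace touched at least once."""
    clusters = cluster_screens(all_embeddings, threshold)
    visited = set(visited_indices)
    covered = sum(1 for c in clusters if visited & set(c["members"]))
    return covered / len(clusters)
```

For example, four screens whose embeddings form two near-duplicate pairs collapse into two clusters; a trace that visits only one pair scores 0.5 coverage, flagging the other functionality as untested.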

📝 Abstract
The demand for quality in mobile applications has increased greatly given users' high reliance on them for daily tasks. Developers work tirelessly to ensure that their applications are both functional and user-friendly. In pursuit of this, Automated Input Generation (AIG) tools have emerged as a promising solution for testing mobile applications by simulating user interactions and exploring app functionalities. However, these tools face significant challenges in navigating complex Graphical User Interfaces (GUIs), and developers often have trouble understanding their output. More specifically, AIG tools have difficulty navigating out of certain screens, such as login pages and advertisements, because they lack contextual understanding, which leads to suboptimal testing coverage. Furthermore, while AIG tools can provide interaction traces consisting of action and screen details, there is limited understanding of their coverage of higher-level functionalities, such as logging in, setting alarms, or saving notes. Understanding these covered use cases is essential to ensuring comprehensive test coverage of app functionalities. Difficulty in testing mobile UIs can also lead to the design of complex interfaces, which adversely affect users of advanced age, who often face usability barriers due to small buttons, cluttered layouts, and unintuitive navigation. Many studies highlight these issues, but automated solutions for improving UI accessibility need more attention. This research seeks to enhance automated UI testing techniques by learning the screen usage semantics of mobile apps, helping AIG tools navigate more efficiently, offering more insight into tested functionalities, and improving the usability of a mobile app's interface by identifying and mitigating UI design issues.
Problem

Research questions and friction points this paper is trying to address.

Improving navigation of complex mobile GUIs in automated testing
Enhancing understanding of high-level functionality coverage in tests
Identifying and mitigating UI design issues for better usability
Innovation

Methods, ideas, or system contributions that make the work stand out.

Learning screen usage semantics for navigation
Enhancing automated input generation tools
Identifying and mitigating UI design issues