Android Instrumentation Testing in Continuous Integration: Practices, Patterns, and Performance

📅 2026-04-03
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study addresses the challenges of unstable end-to-end testing for Android applications in continuous integration (CI) due to fragile emulator configurations. It presents the first large-scale empirical analysis of 4,518 open-source projects, systematically examining how instrumentation tests are configured, how these practices evolve, and their comparative effectiveness in CI environments. Leveraging GitHub Actions metadata, the work evaluates three prevalent approaches: Gradle Managed Devices, community-reusable components, and custom scripts. Findings reveal that only 10.6% of projects adopt such testing; among them, community components demonstrate superior reliability and efficiency, third-party device labs are suitable for regression testing despite higher costs, and custom scripts, while flexible, suffer from high retry rates. The study thus illuminates current practices and critical trade-offs in Android CI testing.
📝 Abstract
Android instrumentation tests (end-to-end tests that run on a device or emulator) can catch problems that simpler tests miss. However, running these tests automatically in continuous integration (CI) is often difficult because emulator setup is fragile and configurations tend to drift over time. We study how open-source Android apps run instrumentation tests in CI by analyzing 4,518 repositories that use CI (snapshot: Aug. 10, 2025). We examine CI workflow files, scripts, and build configurations to identify cases where device setup is defined in Gradle (e.g., Gradle Managed Devices). Our results answer three questions about adoption, evolution, and outcomes. First, only about one in ten repositories (481/4,518; 10.6%) run instrumentation tests in CI, typically using either reusable community components or repository-specific custom scripts to set up emulators. Second, these setups usually stay the same over time; when changes happen, projects tend to move from custom scripts toward reusable community components. Third, we study why projects change their CI setup by analyzing their commits, pull requests, and issue messages. We evaluate how different setup styles perform using GitHub Actions run- and step-level metadata (e.g., outcomes, duration, reruns, and queue delay). We find that teams often change approaches to expand test coverage, and that each approach fits different needs: community-based setups are typically the most reliable and efficient for everyday checks on new code, third-party device labs suit scheduled regression testing but can be costlier and fail more often, and custom scripting provides flexibility but is associated with more reruns.
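The abstract notes that some projects define device setup directly in Gradle via Gradle Managed Devices, letting the build tool provision and tear down emulators instead of hand-written CI scripts. A minimal sketch of such a configuration in the Kotlin DSL (the module layout and the device name `pixel2api30` are illustrative assumptions, not taken from the paper):

```kotlin
// build.gradle.kts (app module) — illustrative sketch, assuming a recent
// Android Gradle Plugin that supports the managedDevices DSL.
android {
    testOptions {
        managedDevices {
            localDevices {
                // Hypothetical device definition; name and profile chosen for illustration.
                create("pixel2api30") {
                    device = "Pixel 2"          // hardware profile from the AVD manager
                    apiLevel = 30               // Android API level of the system image
                    systemImageSource = "aosp"  // plain AOSP image (no Google APIs)
                }
            }
        }
    }
}
```

With this in place, a CI job can run `./gradlew pixel2api30DebugAndroidTest` and Gradle downloads the system image, boots the emulator, runs the instrumentation tests, and shuts the device down, which is the setup-in-Gradle style the study contrasts with community-reusable workflow components and custom scripts.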
Problem

Research questions and friction points this paper is trying to address.

Android instrumentation testing
continuous integration
emulator setup
CI configuration drift
test automation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Android instrumentation testing
continuous integration
empirical study
test infrastructure
GitHub Actions