Early Accessibility: Automating Alt-Text Generation for UI Icons During App Development

📅 2025-04-17
📈 Citations: 0
Influential: 0
🤖 AI Summary
Mobile application UI icons frequently lack semantic alt-text, severely limiting accessibility for screen reader users. Method: This paper introduces the first development-phase alt-text generation method, integrating UI structural metadata (via DOM parsing), text embedded within icons (extracted via OCR), and contextual textual cues. A structured prompting framework jointly orchestrates a fine-tuned large language model and a multimodal model, enabling precise alt-text generation without requiring full-screen input. Contribution/Results: Unlike existing approaches, which rely on extensive annotated datasets or post-deployment processing, this method avoids technical debt while significantly improving alt-text accuracy and contextual coherence. Experimental evaluation on real-world development scenarios demonstrates superior generation quality compared to state-of-the-art deep learning and vision-language models, confirming both effectiveness and engineering practicality.

📝 Abstract
Alt-text is essential for mobile app accessibility, yet UI icons often lack meaningful descriptions, limiting accessibility for screen reader users. Existing approaches either require extensive labeled datasets, struggle with partial UI contexts, or operate post-development, increasing technical debt. We first conduct a formative study to determine when and how developers prefer to generate icon alt-text. We then explore the ALTICON approach for generating alt-text for UI icons during development using two fine-tuned models: a text-only large language model that processes extracted UI metadata and a multi-modal model that jointly analyzes icon images and textual context. To improve accuracy, the method extracts relevant UI information from the DOM tree, retrieves in-icon text via OCR, and applies structured prompts for alt-text generation. Our empirical evaluation with the most closely related deep-learning and vision-language models shows that ALTICON generates alt-text that is of higher quality while not requiring a full-screen input.
Problem

Research questions and friction points this paper is trying to address.

Automating alt-text generation for UI icons during app development
Addressing lack of meaningful descriptions for screen reader users
Reducing technical debt by avoiding post-development alt-text generation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Fine-tuned text-only and multi-modal models
Extracts UI info from DOM tree and OCR
Structured prompts for alt-text generation
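The three innovation points above form a pipeline: pull icon-relevant fields from the DOM/view tree, recover any text rendered inside the icon via OCR, and assemble both into a structured prompt for the fine-tuned model. A minimal sketch of that flow is below; all function names, field names, and the prompt template are illustrative assumptions, not the paper's actual ALTICON implementation (the OCR step is stubbed out rather than calling a real OCR engine).

```python
# Illustrative sketch of a development-phase alt-text pipeline.
# Field names, the prompt template, and the stubbed OCR step are
# assumptions for demonstration, not the paper's implementation.

def extract_ui_metadata(dom_node: dict) -> dict:
    """Pull the icon-relevant fields from a parsed DOM/view-tree node."""
    return {
        "resource_id": dom_node.get("resource-id", ""),
        "widget_class": dom_node.get("class", ""),
        "sibling_texts": [
            s.get("text", "")
            for s in dom_node.get("siblings", [])
            if s.get("text")
        ],
    }

def ocr_icon_text(raw_ocr_output: str) -> str:
    """Stand-in for an OCR pass over the icon image (a real system
    would run an OCR engine here); just normalizes the result."""
    return raw_ocr_output.strip()

def build_alt_text_prompt(metadata: dict, in_icon_text: str) -> str:
    """Assemble the structured prompt fed to the fine-tuned model."""
    nearby = "; ".join(metadata["sibling_texts"]) or "none"
    return (
        "Generate concise alt-text for a UI icon.\n"
        f"Resource id: {metadata['resource_id']}\n"
        f"Widget class: {metadata['widget_class']}\n"
        f"Text inside icon (OCR): {in_icon_text or 'none'}\n"
        f"Nearby text: {nearby}\n"
        "Alt-text:"
    )

# Example: a share button with a labeled sibling view.
node = {
    "resource-id": "btn_share",
    "class": "android.widget.ImageButton",
    "siblings": [{"text": "Share with friends"}],
}
prompt = build_alt_text_prompt(extract_ui_metadata(node), ocr_icon_text(""))
print(prompt)
```

The key design point the paper emphasizes is that only these local, structured inputs are needed, so the model can generate alt-text without a full-screen screenshot.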