A11YN: aligning LLMs for accessible web UI code generation

📅 2025-10-15
📈 Citations: 0
Influential: 0
🤖 AI Summary
Problem: Large language models (LLMs) often generate web UI code that inherits accessibility defects from their training data and so fails to meet the diverse needs of users with disabilities. Method: We propose A11yn, the first accessibility-aligned code generation framework, comprising (i) UIReq-6.8K, a training dataset of 6,800 diverse web UI generation instructions, together with RealUIReq-300, a benchmark of 300 real-world web UI requests; and (ii) a fine-grained reward function grounded in WCAG and driven by an automated accessibility testing engine, which enables severity-aware penalization during reinforcement learning. Contribution/Results: Our approach lowers the Inaccessibility Rate by 60% relative to the base model while preserving semantic fidelity and visual quality. It is the first work to systematically demonstrate the feasibility and effectiveness of aligning LLM-generated UI code with accessibility standards, establishing a foundational methodology for inclusive interface generation.

📝 Abstract
Large language models (LLMs) have recently demonstrated strong capabilities in generating functional and aesthetic web interfaces directly from instructions. However, these models often replicate accessibility flaws from their training data, resulting in interfaces that exclude users with diverse needs and contexts. To address this gap, we introduce A11yn, the first method that aligns code-generating LLMs to reliably produce accessibility-compliant web UIs. A11yn optimizes a novel reward function that penalizes violations of the Web Content Accessibility Guidelines (WCAG), with penalties scaled to the severity of each violation as identified by an accessibility testing engine. To support training, we construct UIReq-6.8K, a dataset of 6,800 diverse instructions for web UI generation. For evaluation, we introduce RealUIReq-300, a benchmark of 300 real-world web UI requests grounded and manually curated from public web pages, spanning a broad range of use cases. Empirical results show that A11yn significantly outperforms strong baselines, lowering the Inaccessibility Rate by 60% over the base model while preserving semantic fidelity and visual quality of generated UIs. These findings demonstrate that accessibility can be systematically optimized within LLMs, showing the feasibility of aligning code generation for accessibility.
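The severity-scaled penalty described in the abstract can be illustrated with a minimal sketch. This is not the paper's reward function: the severity labels are modeled on the impact levels that engines such as axe-core report, while the weights, the `max_penalty` cap, the `Violation` type, and the `accessibility_reward` name are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Hypothetical weights per severity level; the paper's actual scaling
# is not stated in this summary.
SEVERITY_WEIGHTS = {
    "minor": 1.0,
    "moderate": 2.0,
    "serious": 3.0,
    "critical": 4.0,
}


@dataclass(frozen=True)
class Violation:
    """One WCAG violation as an accessibility testing engine might report it."""
    rule_id: str   # illustrative rule name, e.g. "image-alt"
    severity: str  # one of the keys in SEVERITY_WEIGHTS


def accessibility_reward(violations, max_penalty=20.0):
    """Map a list of violations to a reward in [0, 1].

    Each violation adds a penalty equal to its severity weight. The total
    is clipped at max_penalty (an assumed cap) so the reward stays bounded.
    """
    penalty = sum(SEVERITY_WEIGHTS[v.severity] for v in violations)
    clipped = min(penalty, max_penalty)
    return 1.0 - clipped / max_penalty


if __name__ == "__main__":
    report = [
        Violation("image-alt", "critical"),
        Violation("color-contrast", "serious"),
    ]
    print(accessibility_reward(report))
```

Under these assumptions a page with no violations receives the full reward of 1.0, and a single critical violation costs four times as much as a minor one. In a reinforcement learning setup, a score of this kind would be computed for each generated UI and used as the training signal.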
Problem

Research questions and friction points this paper is trying to address.

Aligning LLMs to generate accessible web UI code that complies with WCAG
Addressing accessibility flaws in LLM-generated interfaces that exclude diverse users
Optimizing reward functions to penalize WCAG violations based on severity levels
Innovation

Methods, ideas, or system contributions that make the work stand out.

Aligns LLMs to generate accessibility-compliant web UIs
Optimizes a reward function that penalizes WCAG violations in proportion to their severity
Lowers the Inaccessibility Rate by 60% relative to the base model while preserving semantic fidelity and visual quality
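The 60% figure is read here as a relative reduction against the base model, which is how the abstract phrases it. The sketch below only shows that arithmetic. The rates in the example are hypothetical, and the paper's exact definition of Inaccessibility Rate is not given in this summary.

```python
def relative_reduction(base_rate: float, new_rate: float) -> float:
    """Return the fraction by which new_rate is lower than base_rate."""
    if base_rate <= 0:
        raise ValueError("base_rate must be positive")
    return (base_rate - new_rate) / base_rate


if __name__ == "__main__":
    # Hypothetical values: a base model at 0.50 and an aligned model at 0.20.
    print(relative_reduction(0.50, 0.20))
```

With these made-up values, a drop from 0.50 to 0.20 is a 60% relative reduction, even though the absolute difference is only 0.30.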
Janghan Yoon
Yonsei University
Jaegwan Cho
Yonsei University
Junhyeok Kim
Yonsei University
Jiwan Chung
Yonsei University
Computer Vision, NLP, Multimodal Learning
Jaehyun Jeon
Yonsei University
Youngjae Yu
Seoul National University