AI Summary
In UI-intensive applications, text-only requirements often fail to accurately elicit acceptance criteria (ACs). To address this, we propose RAGcceptance M2RE, the first retrieval-augmented generation framework tailored for multimodal requirements (text + UI screenshots). By synergistically integrating large language models with RAG, our method jointly models textual specifications and visual interface elements to precisely capture stakeholder intent and uncover critical conditions frequently overlooked by domain experts. Evaluated on an industrial-scale educational system serving 100,000 users, RAGcceptance M2RE-generated ACs significantly outperform baselines in relevance, correctness, and comprehensibility; manual effort is reduced by 62%. This work pioneers the application of RAG in multimodal requirements engineering, establishing a novel paradigm for automated, high-fidelity AC generation.
Abstract
Acceptance criteria (ACs) play a critical role in software development by clearly defining the conditions under which a software feature satisfies stakeholder expectations. However, manually creating accurate, comprehensive, and unambiguous acceptance criteria is challenging, particularly in user interface-intensive applications, due to the reliance on domain-specific knowledge and visual context that is not always captured by textual requirements alone. To address these challenges, we propose RAGcceptance M2RE, a novel approach that leverages Retrieval-Augmented Generation (RAG) to generate acceptance criteria from multi-modal requirements data, including both textual documentation and visual UI information. We systematically evaluated our approach in an industrial case study involving an education-focused software system used by approximately 100,000 users. The results indicate that integrating multi-modal information significantly enhances the relevance, correctness, and comprehensibility of the generated ACs. Moreover, practitioner evaluations confirm that our approach effectively reduces manual effort, captures nuanced stakeholder intent, and provides valuable criteria that domain experts may overlook, demonstrating practical utility and significant potential for industry adoption. This research underscores the potential of multi-modal RAG techniques in streamlining software validation processes and improving development efficiency. We also make our implementation and a dataset available.