🤖 AI Summary
This work addresses natural-language software explanations, which are often inconsistent, weakly relevant, or poorly formulated, while purely automatic LLM generation lacks reliable grounding and controllable output quality. The authors propose a human-AI collaborative formulation support tool that combines an empirically derived quality guideline with large language model (LLM) assistance in a generate–evaluate–iterate workflow, keeping domain control with developers. In a small-scale experiment with six industry practitioners, tool-supported formulation was on average 24.4% faster, although inferential analyses indicated only a trend for efficiency. In a subsequent user study with 17 participants and 204 paired comparisons, tool-supported explanations were rated significantly higher in overall satisfaction than manually written ones (p = 0.003, rank-biserial correlation = 0.86).
📝 Abstract
As software systems increasingly rely on natural-language explanations to address user-reported explanation needs in requirements communication and support, ensuring that such explanations are consistent, relevant, and well formulated remains a major challenge. Purely automatic large language model (LLM) generation often lacks reliable grounding and controllable output quality. In this paper, we present a guideline-based formulation support tool for software explanations that combines LLM-assisted text generation with an empirically derived quality guideline. The tool structures the writing process into generation, quality checking, and iterative revision, while keeping domain control with developers. We evaluated the approach in a two-phase study consisting of an interview-based developer experiment and a controlled user survey. Six industry practitioners with software development or DevOps experience formulated explanations for real explanation needs in a human-only manual condition and in a human-with-LLM-support condition. In this small-scale evaluation, tool-supported formulation was on average 24.4% faster, although inferential analyses indicated only a trend for efficiency. In a subsequent user study with 17 participants and 204 paired comparisons, tool-supported explanations were rated significantly higher in overall satisfaction than manual explanations (p=0.003, rank-biserial correlation=0.86). Our findings suggest potential efficiency gains and higher perceived formulation quality through guideline-driven LLM assistance. Future work should examine long-term industrial use and integration into existing development workflows.
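The generation, quality checking, and iterative revision steps described in the abstract can be pictured as a simple loop. The following is a minimal sketch under stated assumptions, not the authors' implementation: `Criterion`, `failed_criteria`, `formulate`, `stub_generate`, and the two example checks are hypothetical names invented for illustration, the LLM call is replaced by a deterministic stub, and the paper's actual guideline is not reproduced here.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Criterion:
    """One guideline check: a label plus a predicate over the draft text (hypothetical)."""
    name: str
    passes: Callable[[str], bool]


def failed_criteria(draft: str, guideline: List[Criterion]) -> List[str]:
    """Quality check: return the names of all criteria the draft does not meet."""
    return [c.name for c in guideline if not c.passes(draft)]


def formulate(
    need: str,
    generate: Callable[[str, List[str]], str],
    guideline: List[Criterion],
    developer_review: Callable[[str], Optional[str]],
    max_rounds: int = 3,
) -> Optional[str]:
    """Generate, check, and revise until the developer accepts or rounds run out."""
    notes: List[str] = []
    for _ in range(max_rounds):
        draft = generate(need, notes)             # generation (LLM stand-in)
        notes = failed_criteria(draft, guideline)  # quality checking
        if notes:
            continue                               # revise flagged issues first
        comment = developer_review(draft)          # developer keeps the final say
        if comment is None:
            return draft
        notes = [comment]                          # revise based on developer feedback
    return None


def stub_generate(need: str, notes: List[str]) -> str:
    """Deterministic placeholder for an LLM call (assumption for this sketch)."""
    text = f"The export failed because {need}."
    if "mentions next step" in notes:
        text += " Retry after reducing the file size."
    return text


# Illustrative checks only; they are not taken from the paper's guideline.
guideline = [
    Criterion("mentions next step", lambda d: "Retry" in d),
    Criterion("is concise", lambda d: len(d.split()) <= 40),
]

result = formulate(
    "the file exceeded the upload limit",
    stub_generate,
    guideline,
    developer_review=lambda draft: None,  # None means the developer accepts
)
print(result)
```

In this sketch the automated check only gates what reaches the developer. A draft is returned solely when `developer_review` accepts it, which mirrors the abstract's point that domain control stays with developers.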