Formal Methods Meets Readability: Auto-Documenting JML Java Code

📅 2025-06-10

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study investigates whether Java Modeling Language (JML) formal specifications improve the quality of Javadoc generated by large language models (LLMs). To this end, we systematically compare documentation generated from JML-annotated and non-annotated Java code, using both automated metrics and expert human evaluation to assess completeness, accuracy, and specification coverage. Our key contribution is the first identification of a threshold effect for JML: the benefits become more pronounced for classes with richer sets of class-level invariants. JML's primary benefit lies in strengthening specification coverage, particularly for complex invariants and design contracts, rather than in altering descriptive style. Experimental results show that JML significantly improves the completeness of class-level Javadoc, with more modest gains at the method level; specification coverage increases markedly, while core descriptive quality (e.g., natural-language clarity or conciseness) remains largely unchanged.

📝 Abstract

This paper investigates whether formal specifications using Java Modeling Language (JML) can enhance the quality of Large Language Model (LLM)-generated Javadocs. While LLMs excel at producing documentation from code alone, we hypothesize that incorporating formally verified invariants yields more complete and accurate results. We present a systematic comparison of documentation generated from JML-annotated and non-annotated Java classes, evaluating quality through both automated metrics and expert analysis. Our findings demonstrate that JML significantly improves class-level documentation completeness, with more moderate gains at the method level. Formal specifications prove particularly effective in capturing complex class invariants and design contracts that are frequently overlooked in code-only documentation. A threshold effect emerges, where the benefits of JML become more pronounced for classes with richer sets of invariants. While JML enhances specification coverage, its impact on core descriptive quality is limited, suggesting that formal specifications primarily ensure comprehensive coverage rather than fundamentally altering implementation descriptions. These results offer actionable insights for software teams adopting formal methods in documentation workflows, highlighting scenarios where JML provides clear advantages. The study contributes to AI-assisted software documentation research by demonstrating how formal methods and LLMs can synergistically improve documentation quality.
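To make the comparison concrete, here is a minimal sketch of what a JML-annotated Java class might look like. The class `BoundedCounter` and its contracts are hypothetical and are not taken from the paper's dataset; they only illustrate the kind of class invariants and method contracts the abstract refers to, which an LLM would see in the annotated condition and would have to infer from the implementation in the non-annotated one.

```java
// Hypothetical example class (not from the paper). The JML annotations
// live in specially marked comments, so plain javac ignores them.
public class BoundedCounter {

    // Class invariants: properties that hold in every visible state.
    //@ public invariant 0 <= count && count <= max;
    //@ public invariant max > 0;

    private /*@ spec_public @*/ int count;
    private /*@ spec_public @*/ final int max;

    // Method contract: precondition (requires) and postcondition (ensures).
    //@ requires max > 0;
    //@ ensures this.max == max && count == 0;
    public BoundedCounter(int max) {
        if (max <= 0) {
            throw new IllegalArgumentException("max must be positive");
        }
        this.max = max;
        this.count = 0;
    }

    //@ requires count < max;
    //@ ensures count == \old(count) + 1;
    public void increment() {
        if (count >= max) {
            throw new IllegalStateException("counter is full");
        }
        count++;
    }

    //@ ensures \result == count;
    public /*@ pure @*/ int value() {
        return count;
    }
}
```

In this sketch, the invariant `0 <= count && count <= max` is stated once at class level, which is the kind of information the abstract says is frequently overlooked when documentation is generated from code alone.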

Problem

Research questions and friction points this paper is trying to address.

Enhancing LLM-generated Javadocs with JML specifications

Comparing the quality of documentation generated from JML-annotated vs. non-annotated Java code

Evaluating JML's impact on class-level and method-level documentation completeness
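A comparison like the one described above needs matched pairs of annotated and non-annotated sources. The summary does not say how the non-annotated variants were produced, so the following is only one possible approach, sketched under the assumption that the plain variant is derived by stripping JML comments from the annotated file. The class name `JmlStripper` is hypothetical, and the regular expressions are deliberately naive (they do not handle `//@` or `/*@` appearing inside string literals).

```java
import java.util.regex.Pattern;

// Hypothetical helper (an assumption, not the paper's tooling): derives a
// non-annotated variant of a Java source by removing JML comments.
public final class JmlStripper {

    // A line that contains nothing but a "//@ ..." annotation.
    private static final Pattern WHOLE_LINE =
            Pattern.compile("(?m)^[ \\t]*//@.*(?:\\R|\\z)");

    // A "//@ ..." annotation trailing after code on the same line.
    private static final Pattern TRAILING =
            Pattern.compile("[ \\t]*//@.*");

    // A "/*@ ... @*/" block annotation, plus one following space if present.
    private static final Pattern BLOCK =
            Pattern.compile("/\\*@.*?\\*/[ \\t]?", Pattern.DOTALL);

    private JmlStripper() { }

    public static String strip(String source) {
        String s = WHOLE_LINE.matcher(source).replaceAll("");
        s = TRAILING.matcher(s).replaceAll("");
        return BLOCK.matcher(s).replaceAll("");
    }
}
```

Calling `JmlStripper.strip(annotatedSource)` would yield the code-only input, so the same class can be passed to the LLM in both conditions with the JML annotations as the only difference.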

Innovation

Methods, ideas, or system contributions that make the work stand out.

JML enhances the completeness of LLM-generated Javadocs

Formal specifications improve coverage of complex invariants

JML and LLMs together improve documentation quality

🔎 Similar Papers

SpecGen: Automated Generation of Formal Program Specifications via Large Language Models