Document Retrieval Augmented Fine-Tuning (DRAFT) for safety-critical software assessments

📅 2025-05-02

🤖 AI Summary

Compliance assessment of safety-critical software has traditionally been limited by manual evaluation against complex regulatory frameworks. To address this, the paper proposes DRAFT, a fine-tuning framework built around a dual-retrieval architecture that draws on both the software documentation and the applicable reference standards. A semi-automated data generation method combines a variable number of relevant documents with distractors so that training examples mirror real-world assessment scenarios. DRAFT builds on retrieval-augmented generation (RAG) and is evaluated by fine-tuning GPT-4o-mini. In the reported experiments, it improves correctness by 7% over the baseline model, with qualitative improvements in evidence handling, response structure, and domain-specific reasoning. The authors present it as a practical approach that preserves the transparency and evidence-based reasoning required in regulatory domains.

📝 Abstract

Safety-critical software must be rigorously assessed against complex regulatory frameworks, a process traditionally limited by manual evaluation. This paper presents Document Retrieval-Augmented Fine-Tuning (DRAFT), a novel approach that enhances the capabilities of a large language model (LLM) for safety-critical compliance assessment. DRAFT builds upon existing Retrieval-Augmented Generation (RAG) techniques by introducing a fine-tuning framework that accommodates our dual-retrieval architecture, which simultaneously accesses both software documentation and applicable reference standards. To fine-tune DRAFT, we develop a semi-automated dataset generation methodology that incorporates variable numbers of relevant documents with meaningful distractors, closely mirroring real-world assessment scenarios. Experiments with GPT-4o-mini demonstrate a 7% improvement in correctness over the baseline model, with qualitative improvements in evidence handling, response structure, and domain-specific reasoning. DRAFT represents a practical approach to improving compliance assessment systems while maintaining the transparency and evidence-based reasoning essential in regulatory domains.
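
The dual-retrieval idea described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the `Chunk` type, the Jaccard word-overlap scoring, the prompt layout, and the sample texts and identifiers (`DOC-1`, `STD-A`, and so on) are all assumptions made for illustration. A real system would most likely use embedding-based retrieval. The point shown is only that two separate corpora are queried with the same question and both result sets are placed in the model's context.

```python
from dataclasses import dataclass

PUNCT = ".,;:()?!"


@dataclass(frozen=True)
class Chunk:
    source: str  # "documentation" or "standard" (hypothetical labels)
    ref: str     # hypothetical identifier used for evidence citation
    text: str


def tokenize(text):
    """Lowercase word set with surrounding punctuation stripped."""
    words = (w.strip(PUNCT).lower() for w in text.split())
    return {w for w in words if w}


def score(query, chunk):
    """Jaccard overlap between query words and chunk words (a stand-in scorer)."""
    q, c = tokenize(query), tokenize(chunk.text)
    union = q | c
    return len(q & c) / len(union) if union else 0.0


def retrieve(query, corpus, k):
    """Return the k highest-scoring chunks from one corpus."""
    return sorted(corpus, key=lambda ch: score(query, ch), reverse=True)[:k]


def build_prompt(question, docs, standards, k_docs=2, k_std=2):
    """Query both corpora separately and merge the hits into one prompt."""
    lines = ["Software documentation:"]
    lines += [f"[{c.ref}] {c.text}" for c in retrieve(question, docs, k_docs)]
    lines.append("Reference standards:")
    lines += [f"[{c.ref}] {c.text}" for c in retrieve(question, standards, k_std)]
    lines.append(f"Question: {question}")
    return "\n".join(lines)


# Illustrative data only; none of these snippets come from the paper.
docs = [
    Chunk("documentation", "DOC-1", "The watchdog timer resets the controller after a timeout."),
    Chunk("documentation", "DOC-2", "The user interface supports three languages."),
]
standards = [
    Chunk("standard", "STD-A", "A watchdog timer shall be used to detect controller failure."),
    Chunk("standard", "STD-B", "Documentation shall be archived for ten years."),
]
question = "Is a watchdog timer used to detect controller failure?"
prompt = build_prompt(question, docs, standards, k_docs=1, k_std=1)
```

Keeping the two retrieval paths separate, rather than searching one merged index, guarantees that the context always contains evidence from both the system under assessment and the standard it is assessed against.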

Problem

Research questions and friction points this paper is trying to address.

Enhancing LLM capabilities for safety-critical compliance assessment

Automating evaluation against complex regulatory frameworks

Improving evidence handling and domain-specific reasoning accuracy

Innovation

Methods, ideas, or system contributions that make the work stand out.

DRAFT combines dual-retrieval with fine-tuning for compliance.

Semi-automated dataset generation mixes relevant documents with distractors to mirror real-world assessment scenarios.

Improves correctness and reasoning in regulatory compliance assessments.
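
The dataset construction described above, a variable number of relevant documents mixed with distractors, might look roughly like the following sketch. The function name `make_example`, the record fields, and the sample texts are hypothetical, and the manual review step implied by "semi-automated" is not shown. It only illustrates how relevant passages and distractors could be combined and shuffled into one fine-tuning record.

```python
import random


def make_example(question, answer, relevant, distractor_pool, n_distractors, rng):
    """Build one training record whose context mixes relevant and distractor passages."""
    # Never draw a passage that is already in the relevant set.
    candidates = [d for d in distractor_pool if d not in relevant]
    distractors = rng.sample(candidates, min(n_distractors, len(candidates)))
    context = list(relevant) + distractors
    rng.shuffle(context)  # avoid leaking relevance through position
    return {
        "question": question,
        "context": context,
        "relevant": list(relevant),
        "answer": answer,
    }


# Illustrative data only; none of these passages come from the paper.
rng = random.Random(0)  # fixed seed so the sketch is reproducible
relevant = [
    "The watchdog timer resets the controller after a timeout.",
    "A watchdog timer shall be used to detect controller failure.",
]
pool = relevant + [
    "The user interface supports three languages.",
    "Documentation shall be archived for ten years.",
    "The enclosure is rated for outdoor use.",
]
example = make_example(
    question="Is a watchdog timer used to detect controller failure?",
    answer="Yes. The documentation describes a watchdog timer, as the standard requires.",
    relevant=relevant,
    distractor_pool=pool,
    n_distractors=2,
    rng=rng,
)
```

Varying the length of `relevant` and the value of `n_distractors` across records is one way to expose the model to contexts of differing difficulty during fine-tuning.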

🔎 Similar Papers

A Systematic Literature Review on Explainability for Machine/Deep Learning-based Software Engineering Research