AI Summary
In safety-critical software compliance assessment, manual review suffers from low efficiency, inaccurate evidence citation, and weak reasoning robustness. To address these challenges, this paper proposes DRAFT, a novel framework featuring a dual-path collaborative retrieval-augmented paradigm that jointly retrieves software documentation and regulatory standards. We introduce a semi-automated data generation method incorporating distractors to realistically model expert cognitive load during evaluation. DRAFT integrates retrieval-augmented generation (RAG), supervised fine-tuning, and lightweight adaptation of GPT-4o-mini. Evaluated in highly regulated settings, DRAFT improves assessment correctness by 7% over the baseline model and qualitatively improves evidence traceability, response structuring, and domain-specific reasoning stability. It establishes a reproducible, verifiable pathway for high-assurance AI-assisted compliance review.
Abstract
Safety-critical software requires rigorous assessment against complex regulatory frameworks, a process traditionally limited by manual evaluation. This paper presents Document Retrieval-Augmented Fine-Tuning (DRAFT), a novel approach that enhances the capabilities of a large language model (LLM) for safety-critical compliance assessment. DRAFT builds upon existing Retrieval-Augmented Generation (RAG) techniques by introducing a fine-tuning framework that accommodates our dual-retrieval architecture, which simultaneously accesses both software documentation and applicable reference standards. To fine-tune DRAFT, we develop a semi-automated dataset generation methodology that incorporates variable numbers of relevant documents with meaningful distractors, closely mirroring real-world assessment scenarios. Experiments with GPT-4o-mini demonstrate a 7% improvement in correctness over the baseline model, with qualitative improvements in evidence handling, response structure, and domain-specific reasoning. DRAFT represents a practical approach to improving compliance assessment systems while maintaining the transparency and evidence-based reasoning essential in regulatory domains.
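As a rough illustration of the dataset idea described above, the sketch below assembles one training example whose two contexts (software documentation and reference standards) each mix the relevant passages with a variable number of distractors. It is not the authors' pipeline: the function name `build_example`, its parameters, the dictionary keys, and the uniform choice of distractor count are all assumptions made for illustration.

```python
import random


def build_example(question, answer, relevant_docs, relevant_standards,
                  distractor_docs, distractor_standards,
                  max_distractors=3, seed=None):
    """Hypothetical helper: build one dual-context fine-tuning example.

    Each context keeps every relevant passage and adds between 0 and
    `max_distractors` distractors, then shuffles so position carries no signal.
    """
    rng = random.Random(seed)

    def mix(relevant, distractors):
        # Assumption: the number of distractors is drawn uniformly.
        k = rng.randint(0, min(max_distractors, len(distractors)))
        context = list(relevant) + rng.sample(list(distractors), k)
        rng.shuffle(context)
        return context

    return {
        "question": question,
        "documentation": mix(relevant_docs, distractor_docs),
        "standards": mix(relevant_standards, distractor_standards),
        "answer": answer,
    }


example = build_example(
    question="Does the design document address requirement traceability?",
    answer="Yes; see the cited documentation section and standard clause.",
    relevant_docs=["doc: traceability matrix section"],
    relevant_standards=["std: traceability clause"],
    distractor_docs=["doc: build instructions", "doc: UI style guide"],
    distractor_standards=["std: tool qualification clause"],
    seed=0,
)
print(len(example["documentation"]), len(example["standards"]))
```

Keeping the two contexts separate mirrors the dual-retrieval setup, in which documentation and standards come from different sources; a real pipeline would draw distractors from a retriever rather than a fixed list.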