InformGen: An AI Copilot for Accurate and Compliant Clinical Research Consent Document Generation

📅 2025-04-01
🤖 AI Summary
This study addresses the challenge of ensuring both regulatory compliance and factual accuracy when generating informed consent forms (ICFs) for clinical trials, a high-stakes documentation task. It introduces InformGen, an LLM-driven copilot for ICF drafting that: (1) explicitly encodes 18 core compliance rules derived from FDA guidelines; (2) is evaluated on a benchmark dataset of protocols and ICFs from 900 clinical trials; and (3) integrates protocol parsing, rule-guided generation, inline source citation, and human-in-the-loop feedback. Its key contribution is a traceable, verifiable, compliance-aware generation workflow. Experiments report 99.7% compliance with the core rules, up to 30% higher than a vanilla GPT-4o model. With manual intervention, InformGen attains over 90% factual accuracy in a user study with five annotators, compared with 57%-82% for vanilla GPT-4o. The work positions this approach as a trustworthy way to draft high-stakes medical documentation.

📝 Abstract
Leveraging large language models (LLMs) to generate high-stakes documents, such as informed consent forms (ICFs), remains a significant challenge because these documents demand strict regulatory compliance and factual accuracy. Here, we present InformGen, an LLM-driven copilot for accurate and compliant ICF drafting that combines optimized knowledge document parsing and content generation, with humans in the loop. We further construct a benchmark dataset comprising protocols and ICFs from 900 clinical trials. Experimental results demonstrate that InformGen achieves near 100% compliance with 18 core regulatory rules derived from FDA guidelines, outperforming a vanilla GPT-4o model by up to 30%. Additionally, a user study with five annotators shows that InformGen, when integrated with manual intervention, attains over 90% factual accuracy, significantly surpassing the vanilla GPT-4o model's 57%-82%. Crucially, InformGen ensures traceability by providing inline citations to source protocols, which makes each generated statement easy to verify against its source.
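
To make the compliance metric concrete, here is a minimal Python sketch of how a draft could be scored against a set of rules, with the compliance rate taken as the fraction of rules satisfied. It is an illustration under stated assumptions, not InformGen's implementation: the abstract does not list the 18 FDA-derived rules, so the three rules below, the keyword-matching checks, and the names `Rule`, `contains_any`, `compliance_report`, and `compliance_rate` are all hypothetical placeholders.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Rule:
    """A named compliance rule with a boolean check over the draft text."""
    name: str
    check: Callable[[str], bool]


def contains_any(*phrases: str) -> Callable[[str], bool]:
    """Build a check that passes if any phrase occurs in the text (case-insensitive)."""
    lowered = tuple(p.lower() for p in phrases)

    def check(text: str) -> bool:
        t = text.lower()
        return any(p in t for p in lowered)

    return check


# Hypothetical placeholder rules; these are NOT the paper's 18 FDA-derived rules.
RULES: List[Rule] = [
    Rule("states_voluntary_participation", contains_any("voluntary")),
    Rule("describes_risks", contains_any("risk", "side effect")),
    Rule("gives_contact_information", contains_any("contact", "call")),
]


def compliance_report(draft: str, rules: List[Rule] = RULES) -> Dict[str, bool]:
    """Map each rule name to whether the draft satisfies it."""
    return {rule.name: rule.check(draft) for rule in rules}


def compliance_rate(report: Dict[str, bool]) -> float:
    """Fraction of rules satisfied; 0.0 for an empty report."""
    return sum(report.values()) / len(report) if report else 0.0


draft = (
    "Taking part in this study is voluntary. "
    "Possible risks include nausea and headache."
)
report = compliance_report(draft)
print(report)
print(f"compliance rate: {compliance_rate(report):.2f}")
```

Keyword matching is only a stand-in here. A per-rule report like this shows which requirement a draft misses, which is the kind of feedback a human reviewer in the loop could act on.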

Problem

Research questions and friction points this paper is trying to address.

Ensures regulatory compliance in clinical consent documents
Improves factual accuracy in AI-generated medical forms
Provides traceable citations for source protocol verification

Innovation

Methods, ideas, or system contributions that make the work stand out.

LLM-driven copilot for compliant ICF drafting
Optimized knowledge parsing and content generation
Ensures traceability with inline source citations
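
The traceability idea in the list above can be sketched as a small citation audit: each generated sentence carries a marker pointing at a protocol passage, and a checker confirms that every marker resolves. This is a hedged sketch only. The `[P<number>]` marker format, the passage IDs, the sample text, and the function `audit_citations` are assumptions made for illustration; the source does not describe how InformGen formats or validates its inline citations.

```python
import re
from typing import Dict, List, Tuple

# Assumed marker format: "[P<number>]" refers to a protocol passage ID.
CITATION = re.compile(r"\[(P\d+)\]")


def audit_citations(
    sentences: List[str], protocol: Dict[str, str]
) -> Tuple[Dict[str, str], List[str], List[str]]:
    """Check inline citations against the source protocol.

    Returns (resolved, dangling, uncited):
      resolved: cited passage IDs mapped to their protocol text
      dangling: cited IDs that do not exist in the protocol
      uncited:  sentences that carry no citation marker at all
    """
    resolved: Dict[str, str] = {}
    dangling: List[str] = []
    uncited: List[str] = []
    for sentence in sentences:
        ids = CITATION.findall(sentence)
        if not ids:
            uncited.append(sentence)
        for pid in ids:
            if pid in protocol:
                resolved.setdefault(pid, protocol[pid])
            else:
                dangling.append(pid)
    return resolved, dangling, uncited


# Hypothetical protocol passages and draft sentences.
protocol = {
    "P1": "Study drug is administered orally at 10 mg once daily.",
    "P2": "Follow-up visits occur every 4 weeks.",
}
sentences = [
    "You will take 10 mg of the study drug once a day [P1].",
    "You will visit the clinic every 4 weeks [P7].",
    "The study lasts one year.",
]

resolved, dangling, uncited = audit_citations(sentences, protocol)
print("resolved:", resolved)
print("dangling:", dangling)
print("uncited:", uncited)
```

An audit like this only confirms that a citation points somewhere real. Whether the cited passage actually supports the sentence still needs a human or a separate check.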

Authors

Zifeng Wang, School of Computing and Data Science, University of Illinois Urbana-Champaign, Urbana, IL, USA
Junyi Gao, University of Edinburgh (Data Mining; AI for healthcare)
Benjamin Danek, School of Computing and Data Science, University of Illinois Urbana-Champaign, Urbana, IL, USA
Brandon Theodorou, PhD Student, University of Illinois at Urbana-Champaign (Deep Learning in Healthcare; Generative Models)
Ruba Shaik, Carle Illinois College of Medicine, University of Illinois Urbana-Champaign, Champaign, IL, USA
Shivashankar Thati, School of Computing and Data Science, University of Illinois Urbana-Champaign, Urbana, IL, USA
Seunghyun Won, Seoul National University Bundang Hospital, Gyeonggi, Republic of Korea
Jimeng Sun, Professor at University of Illinois Urbana-Champaign (AI for healthcare; Machine learning for healthcare; Deep learning for healthcare)