From RAGs to riches: Utilizing large language models to write documents for clinical trials.

📅 2024-02-26

🏛️ Clinical Trials

📈 Citations: 2

✨ Influential: 0

🤖 AI Summary

To address low drafting efficiency, weak clinical reasoning, and insufficient regulatory compliance in clinical trial documents (e.g., protocols), this study proposes a synergistic framework integrating Retrieval-Augmented Generation (RAG) with commercial large language models (LLMs). Methodologically, we construct a knowledge base unifying structured data from ClinicalTrials.gov and authoritative regulatory guidelines (e.g., ICH-GCP), enabling precise semantic retrieval and controllable generation. Our key contribution is the first systematic validation that RAG significantly enhances LLM performance on critical dimensions—clinical reasoning and reference transparency—where two core metrics improve from ≈40% to ≈80%, while content relevance and terminology accuracy remain consistently above 80%. This framework overcomes the applicability limitations of purely generative models in high-stakes, rigor-critical medical documentation, substantially improving both the usability and regulatory compliance of protocol drafts.

Technology Category

Application Category

📝 Abstract

BACKGROUND/AIMS Clinical trials require numerous documents to be written: Protocols, consent forms, clinical study reports, and many others. Large language models offer the potential to rapidly generate first-draft versions of these documents; however, there are concerns about the quality of their output. Here, we report an evaluation of how good large language models are at generating sections of one such document, clinical trial protocols. METHODS Using an off-the-shelf large language model, we generated protocol sections for a broad range of diseases and clinical trial phases. Each of these document sections we assessed across four dimensions: Clinical thinking and logic; Transparency and references; Medical and clinical terminology; and Content relevance and suitability. To improve performance, we used the retrieval-augmented generation method to enhance the large language model with accurate up-to-date information, including regulatory guidance documents and data from ClinicalTrials.gov. Using this retrieval-augmented generation large language model, we regenerated the same protocol sections and assessed them across the same four dimensions. RESULTS We find that the off-the-shelf large language model delivers reasonable results, especially when assessing content relevance and the correct use of medical and clinical terminology, with scores of over 80%. However, the off-the-shelf large language model shows limited performance in clinical thinking and logic and transparency and references, with assessment scores of ≈40% or less. The use of retrieval-augmented generation substantially improves the writing quality of the large language model, with clinical thinking and logic and transparency and references scores increasing to ≈80%. The retrieval-augmented generation method thus greatly improves the practical usability of large language models for clinical trial-related writing. DISCUSSION Our results suggest that hybrid large language model architectures, such as the retrieval-augmented generation method we utilized, offer strong potential for clinical trial-related writing, including a wide variety of documents. This is potentially transformative, since it addresses several major bottlenecks of drug development.

Problem

Research questions and friction points this paper is trying to address.

Utilizing large language models for clinical trial documentation

Improving efficiency in writing clinical trial documents

Enhancing document quality using advanced language models

Innovation

Methods, ideas, or system contributions that make the work stand out.

Utilizes large language models for document generation

Focuses on clinical trial documentation

Enhances efficiency in medical research processes

🔎 Similar Papers

No similar papers found.

Authors to Follow