🤖 AI Summary
Clinical trial information is fragmented across registries (e.g., ClinicalTrials.gov) and scholarly literature (e.g., PubMed), impeding accessibility and evidence synthesis.
Method: We introduce an interactive retrieval platform that integrates registry data with full-text biomedical literature. Using large language models (GPT-5.1, Gemini-3-Pro), it extracts structured trial information from full-text PubMed articles, translates natural language queries into structured database searches, and answers questions with attribution to specific source sentences, complementing the records consolidated from ClinicalTrials.gov.
Contribution/Results: Compared with relying on ClinicalTrials.gov alone, the platform increases access to structured clinical trial data by 83.8%. Its utility is demonstrated through a user study with clinicians, clinical researchers, and PhD students of pharmaceutical sciences and nursing, together with a systematic automatic evaluation of its information extraction and question answering capabilities. The system aims to provide evidence-grounded support for patients, clinicians, researchers, and policymakers.
📝 Abstract
We present ClinicalTrialsHub, an interactive search-focused platform that consolidates all data from ClinicalTrials.gov and augments it by automatically extracting and structuring trial-relevant information from PubMed research articles. Our system effectively increases access to structured clinical trial data by 83.8% compared to relying on ClinicalTrials.gov alone, with potential to make access easier for patients, clinicians, researchers, and policymakers, advancing evidence-based medicine. ClinicalTrialsHub uses large language models such as GPT-5.1 and Gemini-3-Pro to enhance accessibility. The platform automatically parses full-text research articles to extract structured trial information, translates user queries into structured database searches, and provides an attributed question-answering system that generates evidence-grounded answers linked to specific source sentences. We demonstrate its utility through a user study involving clinicians, clinical researchers, and PhD students of pharmaceutical sciences and nursing, and a systematic automatic evaluation of its information extraction and question answering capabilities.