CT-ADE: An Evaluation Benchmark for Adverse Drug Event Prediction from Clinical Trial Results

📅 2024-04-19
🏛️ arXiv.org
📈 Citations: 1 (Influential: 0)
🤖 AI Summary
Problem: Adverse drug events (ADEs) cause many clinical trial failures, and predicting them accurately for monotherapies remains difficult. Method: We introduce CT-ADE, a multi-label ADE prediction benchmark dataset tailored to monotherapy. It comprises 2,497 drugs and 168,984 drug–ADE pairs extracted from clinical trials, annotated with patient and treatment-context information, with ADE concepts standardized across multiple levels of the MedDRA ontology. Our approach combines structured clinical trial data extraction, multi-label classification modeling, and LLM-based zero-/few-shot evaluation, complemented by ablation studies. Contribution/Results: LLMs reach F1-scores of up to 55.90%. Models that use patient and contextual information improve F1-scores by 21–38% over models that use only chemical structure, a larger gain than LLM domain specialization or scaling provides. CT-ADE is publicly available at https://github.com/ds4dh/CT-ADE and is intended to support AI-driven drug safety assessment and ADE risk prediction.

📝 Abstract
Adverse drug events (ADEs) significantly impact clinical research, causing many clinical trial failures. ADE prediction is key for developing safer medications and enhancing patient outcomes. To support this effort, we introduce CT-ADE, a dataset for multilabel predictive modeling of ADEs in monopharmacy treatments. CT-ADE integrates data from 2,497 unique drugs, encompassing 168,984 drug-ADE pairs extracted from clinical trials, annotated with patient and contextual information, and comprehensive ADE concepts standardized across multiple levels of the MedDRA ontology. Preliminary analyses with large language models (LLMs) achieved F1-scores up to 55.90%. Models using patient and contextual information showed F1-score improvements of 21%-38% over models using only chemical structure data. Our results highlight the importance of target population and treatment regimens in the predictive modeling of ADEs, offering greater performance gains than LLM domain specialization and scaling. CT-ADE provides an essential tool for researchers aiming to leverage artificial intelligence and machine learning to enhance patient safety and minimize the impact of ADEs on pharmaceutical research and development. The dataset is publicly accessible at https://github.com/ds4dh/CT-ADE.
Problem

Research questions and friction points this paper is trying to address.

Predicting adverse drug events (ADEs) from clinical trial data.
Developing a dataset (CT-ADE) for multilabel ADE prediction.
Evaluating ADE prediction performance using large language models.
Innovation

Methods, ideas, or system contributions that make the work stand out.

The CT-ADE dataset integrates treatment and target-population data with drug information.
LLMs evaluated on ADE prediction reach F1-scores of up to 55.90%.
Patient and contextual information improves F1-scores by 21%-38% over chemical structure alone.
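
To make the F1-score figures above concrete, here is a minimal sketch of how a micro-averaged F1 can be computed for multilabel ADE predictions. It is an illustration under stated assumptions, not the paper's evaluation code: the source does not say which F1 averaging is used, the function name `micro_f1` is hypothetical, and the toy labels are invented examples written in the style of MedDRA system organ class names.

```python
from typing import List, Set


def micro_f1(gold: List[Set[str]], pred: List[Set[str]]) -> float:
    """Micro-averaged F1 for multilabel data (one label set per example).

    Assumption: micro averaging is chosen for illustration only; the
    source does not specify the averaging scheme behind its F1-scores.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    tp = fp = fn = 0
    for g, p in zip(gold, pred):
        tp += len(g & p)  # labels predicted and actually present
        fp += len(p - g)  # labels predicted but absent
        fn += len(g - p)  # labels present but missed
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical toy data: each set holds the ADE labels for one drug/trial arm.
gold = [
    {"Gastrointestinal disorders", "Nervous system disorders"},
    {"Cardiac disorders"},
    set(),
]
pred = [
    {"Gastrointestinal disorders"},
    {"Cardiac disorders", "Nervous system disorders"},
    set(),
]

score = micro_f1(gold, pred)
print(f"micro-F1 = {score:.4f}")
```

Micro averaging pools true positives, false positives, and false negatives over all labels before computing F1, so frequent ADE labels dominate the score; a macro average would instead weight every label equally.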
Authors

Anthony Yazdani
PhD student, University of Geneva
Deep learning, Machine learning, Natural language processing, Digital Health

Alban Bornet
Department of Radiology and Medical Informatics, Faculty of Medicine, University of Geneva, Geneva, Switzerland

Boya Zhang
Lawrence Livermore National Laboratory
Design of Experiments, Gaussian processes, Active learning

Philipp Khlebnikov
Risklick AG, Bern, Switzerland

Poorya Amini
Risklick AG, Bern, Switzerland

Douglas Teodoro
Professor, University of Geneva
Biomedical NLP, machine learning for healthcare, medical informatics