CT-ADE: An Evaluation Benchmark for Adverse Drug Event Prediction from Clinical Trial Results

📅 2024-04-19

🏛️ arXiv.org

📈 Citations: 1

✨ Influential: 0

🤖 AI Summary

Problem: Adverse drug events (ADEs) cause many clinical trial failures, yet they remain hard to predict accurately for single-drug (monotherapy) treatments. Method: We introduce CT-ADE, a multi-label ADE prediction benchmark dataset tailored to monotherapy. It comprises 2,497 drugs and 168,984 drug–ADE pairs extracted from clinical trials, annotated with patient-level and treatment-context features, with ADE concepts standardized across multiple levels of the MedDRA hierarchical ontology. Our approach combines structured clinical trial data extraction, multi-label classification modeling, and LLM-based zero-/few-shot evaluation, complemented by ablation studies. Contribution/Results: Models that incorporate patient and contextual features improve F1 scores by 21–38% over models that rely solely on chemical structure, a larger gain than LLM domain specialization or scaling provides. CT-ADE is publicly available at https://github.com/ds4dh/CT-ADE and offers a benchmark for AI-driven drug safety assessment and ADE risk prediction.
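To make the multi-label setup concrete, the sketch below shows one way a single monotherapy record could be turned into a multi-hot target vector. It is only an illustration: the `TrialArmRecord` class, its field names, and the example values are hypothetical and are not taken from the CT-ADE schema. The three labels are MedDRA System Organ Class names, used here as a toy vocabulary.

```python
from dataclasses import dataclass, field


@dataclass
class TrialArmRecord:
    """Hypothetical record layout; NOT the actual CT-ADE schema."""
    drug_smiles: str   # chemical structure of the single drug
    population: str    # free-text description of the target population
    regimen: str       # free-text description of the treatment regimen
    ade_labels: set[str] = field(default_factory=set)  # observed ADE concepts


def multi_hot(labels: set[str], vocabulary: list[str]) -> list[int]:
    """Encode a set of labels as a 0/1 vector aligned with `vocabulary`."""
    return [1 if term in labels else 0 for term in vocabulary]


# Toy label vocabulary (a real one would cover every concept at a chosen MedDRA level).
vocabulary = [
    "Cardiac disorders",
    "Gastrointestinal disorders",
    "Nervous system disorders",
]

# Made-up example values, for illustration only.
record = TrialArmRecord(
    drug_smiles="CC(=O)OC1=CC=CC=C1C(=O)O",  # aspirin
    population="adults, 18-65 years",
    regimen="100 mg oral, once daily",
    ade_labels={"Gastrointestinal disorders"},
)

target = multi_hot(record.ade_labels, vocabulary)
print(target)  # [0, 1, 0]
```

A model in this setting would take the drug, population, and regimen as input and predict one such vector per record, with one position per ADE concept.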

📝 Abstract

Adverse drug events (ADEs) significantly impact clinical research, causing many clinical trial failures. ADE prediction is key for developing safer medications and enhancing patient outcomes. To support this effort, we introduce CT-ADE, a dataset for multilabel predictive modeling of ADEs in monopharmacy treatments. CT-ADE integrates data from 2,497 unique drugs, encompassing 168,984 drug-ADE pairs extracted from clinical trials, annotated with patient and contextual information, and comprehensive ADE concepts standardized across multiple levels of the MedDRA ontology. Preliminary analyses with large language models (LLMs) achieved F1-scores up to 55.90%. Models using patient and contextual information showed F1-score improvements of 21%-38% over models using only chemical structure data. Our results highlight the importance of target population and treatment regimens in the predictive modeling of ADEs, offering greater performance gains than LLM domain specialization and scaling. CT-ADE provides an essential tool for researchers aiming to leverage artificial intelligence and machine learning to enhance patient safety and minimize the impact of ADEs on pharmaceutical research and development. The dataset is publicly accessible at https://github.com/ds4dh/CT-ADE.
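The F1-scores quoted above are computed over multi-label predictions. The abstract does not say which averaging scheme is used, so the following is a minimal sketch that assumes micro-averaging, where true positives, false positives, and false negatives are pooled over all records and labels before computing precision and recall. The function name `micro_f1` and the toy data are illustrative only.

```python
def micro_f1(y_true: list[list[int]], y_pred: list[list[int]]) -> float:
    """Micro-averaged F1 over multi-hot label matrices (assumed averaging scheme)."""
    tp = fp = fn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            if t == 1 and p == 1:
                tp += 1   # label present and predicted
            elif t == 0 and p == 1:
                fp += 1   # label predicted but absent
            elif t == 1 and p == 0:
                fn += 1   # label present but missed
    if tp == 0:
        return 0.0        # avoids division by zero when nothing is correct
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy example: two records, three candidate ADE labels each.
y_true = [[1, 0, 1], [0, 1, 0]]
y_pred = [[1, 0, 0], [0, 1, 1]]

score = micro_f1(y_true, y_pred)
print(round(score, 4))  # 0.6667
```

In the toy example there are two true positives, one false positive, and one false negative, so precision and recall are both 2/3 and the F1-score is 2/3.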

Problem

Research questions and friction points this paper is trying to address.

Predicting adverse drug events (ADEs) from clinical trial data.

Developing a dataset (CT-ADE) for multilabel ADE prediction.

Evaluating ADE prediction performance using large language models.

Innovation

Methods, ideas, or system contributions that make the work stand out.

CT-ADE integrates treatment regimen and target population data with drug-ADE pairs.

LLMs evaluated for ADE prediction reach F1-scores of up to 55.90%.

Patient and contextual information improves F1-scores by 21%-38% over chemical structure data alone.
