GASCADE: Grouped Summarization of Adverse Drug Event for Enhanced Cancer Pharmacovigilance

📅 2025-05-07
🏛️ European Conference on Information Retrieval
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
This study addresses pharmacovigilance in oncology by introducing the novel task of “multi-patient adverse drug event (ADE) grouping summarization for anticancer drugs,” aiming to enhance pharmacoepidemiological decision-making and patient-centered understanding. Method: We propose a hybrid framework integrating large language model (LLM)-based information extraction with T5’s abstractive summarization capability. To our knowledge, this is the first application of Direct Preference Optimization (DPO) within an encoder-decoder architecture for medical summarization, augmented by synthetic data generation and multi-label structured annotation. Contribution/Results: We construct MCADRS, the first high-quality, fully annotated, multi-label ADE dataset for cancer therapeutics. Extensive experiments demonstrate that our method consistently outperforms all baselines in both automated metrics (e.g., ROUGE, BERTScore) and human evaluation, significantly improving clinical relevance, completeness, and readability of summaries. The code and MCADRS dataset are publicly released.

πŸ“ Abstract
In the realm of cancer treatment, summarizing adverse drug events (ADEs) reported by patients using prescribed drugs is crucial for enhancing pharmacovigilance practices and improving drug-related decision-making. While the volume and complexity of pharmacovigilance data have increased, existing research in this field has predominantly focused on general diseases rather than specifically addressing cancer. This work introduces the task of grouped summarization of adverse drug events reported by multiple patients using the same drug for cancer treatment. To address the challenge of limited resources in cancer pharmacovigilance, we present the MultiLabeled Cancer Adverse Drug Reaction and Summarization (MCADRS) dataset. This dataset includes pharmacovigilance posts detailing patient concerns regarding drug efficacy and adverse effects, along with extracted labels for drug names, adverse drug events, severity, and adversity of reactions, as well as summaries of ADEs for each drug. Additionally, we propose the Grouping and Abstractive Summarization of Cancer Adverse Drug events (GASCADE) framework, a novel pipeline that combines the information extraction capabilities of Large Language Models (LLMs) with the summarization power of the encoder-decoder T5 model. Our work is the first to apply alignment techniques, including advanced algorithms like Direct Preference Optimization, to encoder-decoder models using synthetic datasets for summarization tasks. Through extensive experiments, we demonstrate the superior performance of GASCADE across various metrics, validated through both automated assessments and human evaluations. This multitasking approach enhances drug-related decision-making and fosters a deeper understanding of patient concerns, paving the way for advancements in personalized and responsive cancer care. The code and dataset used in this work are publicly available.
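As a rough illustration of the grouping step described in the abstract, the sketch below collects per-post extractions by drug and builds one text input for a summarizer. The record fields (`drug`, `ade`, `severity`), the placeholder drug names, the helper names, and the `summarize:` prefix are all assumptions made for this example; they are not the MCADRS schema or the authors' implementation.

```python
from collections import defaultdict

# Toy records standing in for LLM-extracted labels (illustrative only,
# not taken from the MCADRS dataset).
posts = [
    {"drug": "drug_a", "ade": "nausea", "severity": "mild"},
    {"drug": "drug_a", "ade": "fatigue", "severity": "moderate"},
    {"drug": "drug_b", "ade": "rash", "severity": "mild"},
    {"drug": "drug_a", "ade": "nausea", "severity": "severe"},
]


def group_ades_by_drug(records):
    """Map each drug to its reported ADEs and the severities seen for each."""
    grouped = defaultdict(lambda: defaultdict(list))
    for record in records:
        grouped[record["drug"]][record["ade"]].append(record["severity"])
    # Convert nested defaultdicts to plain dicts for predictable behavior.
    return {drug: dict(ades) for drug, ades in grouped.items()}


def build_summarizer_input(drug, ades):
    """Flatten one drug's grouped ADEs into a single prompt string."""
    parts = [
        f"{ade} ({len(severities)} report(s); severity: {', '.join(severities)})"
        for ade, severities in ades.items()
    ]
    return f"summarize: drug: {drug}; adverse events: " + "; ".join(parts)


grouped = group_ades_by_drug(posts)
prompt = build_summarizer_input("drug_a", grouped["drug_a"])
print(prompt)
```

In a full pipeline, a string like `prompt` would presumably be passed to an encoder-decoder summarizer such as T5; that step is omitted here to keep the sketch dependency-free.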
Problem

Research questions and friction points this paper is trying to address.

Summarizing adverse drug events in cancer treatment for better pharmacovigilance
Addressing limited resources in cancer pharmacovigilance with a new dataset
Enhancing drug-related decision-making via grouped ADE summarization framework
Innovation

Methods, ideas, or system contributions that make the work stand out.

Introduces MCADRS dataset for cancer pharmacovigilance
Proposes GASCADE framework with LLMs and T5
Applies alignment techniques to encoder-decoder models
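For readers unfamiliar with Direct Preference Optimization, the following is a minimal sketch of the standard DPO loss for a single preference pair, written in plain Python. It is not the authors' training code: the function name, the example log-probabilities, and `beta=0.1` are assumptions chosen for illustration. For an encoder-decoder model, each log-probability would be the sum of decoder token log-probabilities of a summary given the encoded input.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one (chosen, rejected) summary pair.

    Each argument is a sequence-level log-probability. `beta` scales how
    strongly the policy is pushed away from the frozen reference model
    (0.1 is an assumed value, not one reported in the source).
    """
    # How much more the policy prefers each summary than the reference does.
    chosen_gain = policy_chosen_logp - ref_chosen_logp
    rejected_gain = policy_rejected_logp - ref_rejected_logp
    x = beta * (chosen_gain - rejected_gain)
    # -log(sigmoid(x)) computed in a numerically stable way.
    if x >= 0:
        return math.log1p(math.exp(-x))
    return -x + math.log1p(math.exp(x))


# Policy identical to the reference: loss equals log(2).
baseline = dpo_loss(-5.0, -5.0, -5.0, -5.0)

# Policy favors the chosen summary more than the reference does: lower loss.
improved = dpo_loss(-1.0, -3.0, -2.0, -2.0)
print(baseline, improved)
```

The loss falls below `log(2)` only when the policy raises the chosen summary's likelihood relative to the rejected one by more than the reference model does, which is the behavior preference alignment is meant to reward.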
Sofia Jamil
PhD Research Scholar
Large Language Model, Natural Language Processing, Text to Image Generation Models
Aryan Dabad
Department of Computer Science & Engineering, Indian Institute of Technology Patna, India
Bollampalli Areen Reddy
Department of Computer Science & Engineering, Indian Institute of Technology Patna, India
Sriparna Saha
Department of Computer Science & Engineering, Indian Institute of Technology Patna, India
R. Misra
Department of Computer Science & Engineering, Indian Institute of Technology Patna, India
A. Shakur
Indira Gandhi Institute of Medical Sciences, Patna