GASCADE: Grouped Summarization of Adverse Drug Event for Enhanced Cancer Pharmacovigilance

📅 2025-05-07

🏛️ European Conference on Information Retrieval

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study addresses pharmacovigilance in oncology by introducing a new task: grouped summarization of adverse drug events (ADEs) reported by multiple patients taking the same anticancer drug. The goal is to support drug-related decision-making and a better understanding of patient concerns. Method: The authors propose GASCADE, a pipeline that combines large language model (LLM)-based information extraction with the abstractive summarization capability of the encoder-decoder T5 model. They describe the work as the first to apply alignment techniques, including Direct Preference Optimization (DPO), to encoder-decoder models using synthetic datasets for summarization. Contribution/Results: The authors construct MCADRS, a multi-label dataset of pharmacovigilance posts annotated with drug names, adverse drug events, severity, and adversity of reactions, together with per-drug ADE summaries. In their experiments, GASCADE outperforms the baselines on automated metrics and in human evaluation. The code and the MCADRS dataset are publicly released.

📝 Abstract

In the realm of cancer treatment, summarizing adverse drug events (ADEs) reported by patients using prescribed drugs is crucial for enhancing pharmacovigilance practices and improving drug-related decision-making. While the volume and complexity of pharmacovigilance data have increased, existing research in this field has predominantly focused on general diseases rather than specifically addressing cancer. This work introduces the task of grouped summarization of adverse drug events reported by multiple patients using the same drug for cancer treatment. To address the challenge of limited resources in cancer pharmacovigilance, we present the MultiLabeled Cancer Adverse Drug Reaction and Summarization (MCADRS) dataset. This dataset includes pharmacovigilance posts detailing patient concerns regarding drug efficacy and adverse effects, along with extracted labels for drug names, adverse drug events, severity, and adversity of reactions, as well as summaries of ADEs for each drug. Additionally, we propose the Grouping and Abstractive Summarization of Cancer Adverse Drug events (GASCADE) framework, a novel pipeline that combines the information extraction capabilities of Large Language Models (LLMs) with the summarization power of the encoder-decoder T5 model. Our work is the first to apply alignment techniques, including advanced algorithms like Direct Preference Optimization, to encoder-decoder models using synthetic datasets for summarization tasks. Through extensive experiments, we demonstrate the superior performance of GASCADE across various metrics, validated through both automated assessments and human evaluations. This multitasking approach enhances drug-related decision-making and fosters a deeper understanding of patient concerns, paving the way for advancements in personalized and responsive cancer care. The code and dataset used in this work are publicly available.

Problem

Research questions and friction points this paper is trying to address.

Summarizing adverse drug events in cancer treatment for better pharmacovigilance

Addressing limited resources in cancer pharmacovigilance with a new dataset

Enhancing drug-related decision-making via a grouped ADE summarization framework
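To make the idea of grouped summarization concrete, the sketch below shows one way extracted ADE labels from several patient posts could be grouped by drug and turned into a single input string for a summarizer. It is only an illustration under stated assumptions: the record fields (`drug`, `ade`, `severity`), the placeholder values, and the prompt wording are hypothetical and are not taken from the MCADRS schema or the GASCADE code.

```python
from collections import defaultdict

# Hypothetical extracted records, one per ADE mention in a patient post.
# Field names and values are placeholders, not the MCADRS schema.
records = [
    {"drug": "drug_a", "ade": "nausea", "severity": "moderate"},
    {"drug": "drug_a", "ade": "fatigue", "severity": "mild"},
    {"drug": "drug_b", "ade": "fatigue", "severity": "severe"},
    {"drug": "drug_a", "ade": "nausea", "severity": "moderate"},  # repeat report
]


def group_by_drug(records):
    """Collect unique (ade, severity) pairs per drug, preserving first-seen order."""
    grouped = defaultdict(list)
    for rec in records:
        pair = (rec["ade"], rec["severity"])
        if pair not in grouped[rec["drug"]]:
            grouped[rec["drug"]].append(pair)
    return dict(grouped)


def build_summarizer_input(drug, events):
    """Flatten one drug's grouped events into a text prompt (wording is assumed)."""
    listed = "; ".join(f"{ade} ({severity})" for ade, severity in events)
    return f"summarize adverse events for {drug}: {listed}"


grouped = group_by_drug(records)
prompts = {drug: build_summarizer_input(drug, events) for drug, events in grouped.items()}

for text in prompts.values():
    print(text)
```

In a pipeline of the kind the paper describes, each such string would then be passed to the summarization model, which produces one summary per drug rather than one per post.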

Innovation

Methods, ideas, or system contributions that make the work stand out.

Introduces the MCADRS dataset for cancer pharmacovigilance

Proposes the GASCADE framework, combining LLM-based information extraction with T5 summarization

Applies alignment techniques to encoder-decoder models
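For readers unfamiliar with the alignment step, the following is a minimal sketch of the standard Direct Preference Optimization loss for a single preference pair. It is the generic DPO objective, not the authors' training code; the function name, argument names, and the default `beta=0.1` are assumptions made for illustration. For an encoder-decoder model such as T5, each log-probability would typically be the sum of decoder token log-probabilities of a summary given the encoded input.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one (chosen, rejected) summary pair.

    Each argument is a sequence-level log-probability. `beta` scales how
    strongly the policy is pushed away from the frozen reference model.
    """
    # Log-ratio of policy to reference for each summary.
    chosen_ratio = policy_chosen_logp - ref_chosen_logp
    rejected_ratio = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_ratio - rejected_ratio)
    # loss = -log(sigmoid(margin)) = log(1 + exp(-margin)), computed stably.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# Policy identical to the reference: margin is 0, so the loss is log(2).
neutral = dpo_loss(-11.0, -11.0, -11.0, -11.0)
# Policy prefers the chosen summary more than the reference does: lower loss.
improved = dpo_loss(-10.0, -12.0, -11.0, -11.0)
# Policy prefers the rejected summary: higher loss.
worse = dpo_loss(-12.0, -10.0, -11.0, -11.0)
print(neutral, improved, worse)
```

The loss decreases as the policy assigns relatively more probability to the preferred summary than the reference model does, which is the behavior preference-based alignment aims for.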
