Adverse Event Extraction from Discharge Summaries: A New Dataset, Annotation Scheme, and Initial Findings

📅 2025-06-17
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the under-resourced yet clinically important task of identifying adverse events (AEs) in discharge summaries of elderly patients. We introduce a manually annotated corpus for this population, covering 14 AE types and contextual attributes including negation, diagnosis type, and in-hospital occurrence, with support for discontinuous and overlapping entity annotations. Models built with FlairNLP are evaluated at three annotation granularities: fine-grained, coarse-grained, and coarse-grained with negation. Transformer-based models (e.g., BERT-cased) perform strongly on document-level coarse-grained extraction (F1 = 0.943), but fine-grained entity-level performance is markedly lower (e.g., F1 = 0.675), particularly for rare events and complex attributes. The corpus, available on request via DataLoch, is intended as a benchmark for AE extraction from geriatric clinical text.

📝 Abstract
In this work, we present a manually annotated corpus for Adverse Event (AE) extraction from discharge summaries of elderly patients, a population often underrepresented in clinical NLP resources. The dataset covers 14 clinically significant AEs, such as falls, delirium, and intracranial haemorrhage, along with contextual attributes like negation, diagnosis type, and in-hospital occurrence. Uniquely, the annotation schema supports both discontinuous and overlapping entities, addressing challenges rarely tackled in prior work. We evaluate multiple models using FlairNLP across three annotation granularities: fine-grained, coarse-grained, and coarse-grained with negation. While transformer-based models (e.g., BERT-cased) achieve strong performance on document-level coarse-grained extraction (F1 = 0.943), performance drops notably on fine-grained entity-level tasks (e.g., F1 = 0.675), particularly for rare events and complex attributes. These results show that, despite strong document-level scores, significant challenges remain in detecting underrepresented AEs and capturing nuanced clinical language. Developed within a Trusted Research Environment (TRE), the dataset is available upon request via DataLoch and serves as a robust benchmark for evaluating AE extraction methods and supporting future cross-dataset generalisation.
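
The gap between the document-level and entity-level scores is easier to see with a toy calculation. The sketch below is illustrative only: the tuple layout, label names, and toy annotations are assumptions made for this example, and the paper's actual scoring (performed with FlairNLP) may differ in its matching rules and averaging.

```python
from typing import Set, Tuple

# Hypothetical representation: (doc_id, start, end, label). Not the paper's format.
Entity = Tuple[str, int, int, str]


def f1(gold: set, pred: set) -> float:
    """Micro F1 over two sets of items; 0.0 when nothing matches."""
    tp = len(gold & pred)
    if tp == 0:
        return 0.0
    precision = tp / len(pred)
    recall = tp / len(gold)
    return 2 * precision * recall / (precision + recall)


def to_document_level(entities: Set[Entity]) -> Set[Tuple[str, str]]:
    """Collapse entity mentions to (doc_id, label) pairs, ignoring offsets."""
    return {(doc, label) for doc, _, _, label in entities}


# Toy annotations invented for illustration.
gold = {
    ("d1", 10, 14, "Fall"),
    ("d1", 50, 58, "Delirium"),
    ("d2", 5, 9, "Fall"),
}
pred = {
    ("d1", 10, 14, "Fall"),      # exact match
    ("d1", 48, 58, "Delirium"),  # right label, wrong boundary
    ("d2", 5, 9, "Fall"),        # exact match
}

entity_f1 = f1(gold, pred)  # exact-span matching penalises the boundary error
document_f1 = f1(to_document_level(gold), to_document_level(pred))
print(f"entity-level F1:   {entity_f1:.3f}")
print(f"document-level F1: {document_f1:.3f}")
```

In this toy case a single boundary error lowers the entity-level score while the document-level score is unaffected, since the document still receives the correct AE label.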

Problem

Research questions and friction points this paper is trying to address.

Extracting adverse events from elderly patient discharge summaries
Addressing discontinuous and overlapping entity annotation challenges
Evaluating model performance on rare events and complex attributes
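
To make the discontinuous and overlapping entity challenge concrete, here is a minimal sketch of one way such annotations could be represented. The `Entity` class, its field names, the label string, and the example sentence are all hypothetical; the corpus's actual annotation format is not described in this summary.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Entity:
    """Hypothetical annotation record: one label over one or more text fragments."""
    label: str
    fragments: Tuple[Tuple[int, int], ...]  # (start, end) character offsets, end exclusive
    negated: bool = False

    def text(self, doc: str) -> str:
        return " ".join(doc[s:e] for s, e in self.fragments)

    def is_discontinuous(self) -> bool:
        return len(self.fragments) > 1


def overlaps(a: Entity, b: Entity) -> bool:
    """True if any fragment of a shares at least one character with a fragment of b."""
    return any(s1 < e2 and s2 < e1
               for s1, e1 in a.fragments
               for s2, e2 in b.fragments)


def span(doc: str, phrase: str) -> Tuple[int, int]:
    """Offsets of the first occurrence of phrase in doc."""
    start = doc.index(phrase)
    return (start, start + len(phrase))


# Invented example sentence, not taken from the dataset.
doc = "No evidence of subdural or intracranial haemorrhage."

# "subdural ... haemorrhage" skips the words in between: a discontinuous entity.
subdural = Entity("IntracranialHaemorrhage",
                  (span(doc, "subdural"), span(doc, "haemorrhage")),
                  negated=True)
# "intracranial haemorrhage" is contiguous but shares "haemorrhage" with the first entity.
intracranial = Entity("IntracranialHaemorrhage",
                      (span(doc, "intracranial haemorrhage"),),
                      negated=True)

print(subdural.text(doc), subdural.is_discontinuous())
print(overlaps(subdural, intracranial))
```

A flat one-tag-per-token scheme cannot encode both entities at once, because the token "haemorrhage" belongs to two mentions and one of them has a gap.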

Innovation

Methods, ideas, or system contributions that make the work stand out.

Manually annotated corpus for AE extraction
Annotation schema supports complex entities
Transformer models evaluated on multiple granularities