Evaluating GPT's Capability in Identifying Stages of Cognitive Impairment from Electronic Health Data

📅 2025-02-13
🤖 AI Summary
This study addresses the time-consuming and error-prone nature of manual electronic health record (EHR) review for staging cognitive impairment. It evaluates a zero-shot approach that uses GPT-4o, without fine-tuning or annotated training data, to interpret unstructured clinical notes from two cohorts: specialist notes from 769 patients seen at the Massachusetts General Hospital (MGH) memory clinic, and all notes in a 3-year window for 860 Medicare patients. The model maps the first cohort to global Clinical Dementia Rating (CDR) scores and the second to a three-class diagnosis (normal cognition, MCI, dementia). Agreement with clinical experts was high: weighted kappa = 0.83 for CDR staging on specialist notes, and weighted kappa = 0.91 for three-class classification against specialist chart review, rising to 0.96 on cases the adjudicators rated with high confidence. Because the approach needs no annotated training data, the results suggest GPT-4o could serve as a scalable chart review tool for building research datasets and, in the future, assisting diagnosis.

📝 Abstract
Identifying cognitive impairment within electronic health records (EHRs) is crucial not only for timely diagnoses but also for facilitating research. Information about cognitive impairment often exists within unstructured clinician notes in EHRs, but manual chart reviews are both time-consuming and error-prone. To address this issue, our study evaluates an automated approach using zero-shot GPT-4o to determine the stage of cognitive impairment in two different tasks. First, we evaluated the ability of GPT-4o to determine the global Clinical Dementia Rating (CDR) on specialist notes from 769 patients who visited the memory clinic at Massachusetts General Hospital (MGH), where it achieved a weighted kappa score of 0.83. Second, we assessed GPT-4o's ability to differentiate between normal cognition, mild cognitive impairment (MCI), and dementia on all notes in a 3-year window from 860 Medicare patients. GPT-4o attained a weighted kappa score of 0.91 in comparison to specialist chart reviews and 0.96 on cases that the clinical adjudicators rated with high confidence. Our findings demonstrate GPT-4o's potential as a scalable chart review tool for creating research datasets and, in the future, assisting diagnosis in clinical settings.
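Both results above are reported as weighted kappa, which gives partial credit when two raters disagree by only one step on an ordinal scale. The sketch below shows how such a score can be computed from two lists of ratings. It is an illustration only: the abstract does not say whether linear or quadratic weights were used, so the `weighting` parameter, the function name `weighted_kappa`, and the toy ratings are all assumptions, not the authors' code or data.

```python
from collections import Counter


def weighted_kappa(rater_a, rater_b, categories, weighting="quadratic"):
    """Weighted Cohen's kappa for two raters on an ordinal scale.

    `categories` lists the scale in order (e.g. CDR 0, 0.5, 1, 2, 3).
    The weighting scheme is an assumption; the abstract does not state it.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("rating lists must be non-empty and equally long")
    if len(categories) < 2:
        raise ValueError("need at least two categories")
    if weighting not in ("linear", "quadratic"):
        raise ValueError("weighting must be 'linear' or 'quadratic'")

    index = {c: i for i, c in enumerate(categories)}
    k, n = len(categories), len(rater_a)
    power = 2 if weighting == "quadratic" else 1

    # Observed joint counts and per-rater marginal counts.
    observed = Counter((index[a], index[b]) for a, b in zip(rater_a, rater_b))
    marg_a = Counter(index[a] for a in rater_a)
    marg_b = Counter(index[b] for b in rater_b)

    observed_disagreement = 0.0
    expected_disagreement = 0.0
    for i in range(k):
        for j in range(k):
            # Disagreement weight grows with distance between categories.
            w = (abs(i - j) / (k - 1)) ** power
            observed_disagreement += w * observed[(i, j)] / n
            expected_disagreement += w * (marg_a[i] / n) * (marg_b[j] / n)

    if expected_disagreement == 0:
        raise ValueError("kappa is undefined when both raters use one category")
    return 1.0 - observed_disagreement / expected_disagreement


# Toy example with made-up ratings on the CDR scale (not study data).
cdr_scale = [0, 0.5, 1, 2, 3]
expert = [0, 0.5, 1, 2, 3, 0.5]
model = [0, 0.5, 1, 2, 2, 1]
print(weighted_kappa(expert, model, cdr_scale))
```

A value of 1 means perfect agreement and 0 means agreement no better than chance. In practice, scikit-learn's `cohen_kappa_score` with `weights="linear"` or `weights="quadratic"` computes the same kind of statistic.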
Problem

Research questions and friction points this paper is trying to address.

Assesses whether zero-shot GPT-4o can stage cognitive impairment from EHR notes.
Evaluates GPT-4o's assignment of global Clinical Dementia Rating (CDR) scores on specialist notes.
Tests GPT-4o's ability to distinguish normal cognition, MCI, and dementia in Medicare patients.
Innovation

Methods, ideas, or system contributions that make the work stand out.

GPT-4o applied to cognitive impairment staging
Zero-shot approach on unstructured EHR notes
High weighted kappa agreement with expert review (0.83 to 0.96)
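To make the zero-shot idea concrete, the sketch below builds a prompt that contains only task instructions and the patient's notes, with no labeled examples, and then parses a free-text reply into one of the three diagnostic classes. The prompt wording, the function names `build_zero_shot_prompt` and `parse_label`, and the sample notes are hypothetical; the authors' actual prompt and parsing logic are not described on this page, and no model API call is shown.

```python
# The three classes named in the abstract.
LABELS = ("normal cognition", "mild cognitive impairment", "dementia")


def build_zero_shot_prompt(notes):
    """Assemble a zero-shot prompt: instructions plus notes, no examples.

    The wording here is hypothetical, not the prompt used in the study.
    """
    joined_notes = "\n\n".join(
        f"Note {i + 1}:\n{text}" for i, text in enumerate(notes)
    )
    options = ", ".join(LABELS)
    return (
        "You are reviewing clinical notes for a single patient.\n"
        f"Classify the patient's cognitive status as one of: {options}.\n"
        "Answer with the label only.\n\n"
        f"{joined_notes}"
    )


def parse_label(response_text):
    """Map a model reply to one label, or None if it is ambiguous."""
    lowered = response_text.lower()
    matches = [label for label in LABELS if label in lowered]
    # Accept the reply only when exactly one label is mentioned.
    return matches[0] if len(matches) == 1 else None


# Made-up notes for illustration only.
prompt = build_zero_shot_prompt(
    ["Patient reports mild forgetfulness.", "Independent in daily activities."]
)
print(prompt)
print(parse_label("Mild cognitive impairment"))
```

Returning `None` for ambiguous replies keeps uncertain cases out of the agreement calculation until someone reviews them, which is one possible design choice rather than a description of the study's procedure.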
Authors

Yu Leng
Massachusetts General Hospital, Harvard University
Yingnan He
Massachusetts General Hospital, Harvard University
C. Magdamo
Massachusetts General Hospital, Harvard University
A.-M. Vranceanu
Massachusetts General Hospital, Harvard University
Christine S. Ritchie
Massachusetts General Hospital, Harvard University
Shibani S. Mukerji
Massachusetts General Hospital, Harvard University
L. M. Moura
Massachusetts General Hospital, Harvard University
John R Dickson
Massachusetts General Hospital, Harvard University
Deborah Blacker
Massachusetts General Hospital, Harvard University
Sudeshna Das
Associate Prof. of Neurology Harvard Medical School
Bioinformatics