Identifying social isolation themes in NVDRS text narratives using topic modeling and text-classification methods

📅 2025-06-18
🤖 AI Summary
This study addresses a public health challenge in monitoring social isolation: the National Violent Death Reporting System (NVDRS) has no structured fields for it. We propose a two-stage NLP framework: first, domain-specific lexicon construction via LDA topic modeling; second, hybrid classification integrating fine-tuned BERT with interpretable logistic regression. Applied to narratives from over 300,000 suicides, the framework offers the first systematic identification and quantification of chronic versus situational social isolation. The classifiers achieve an average F1-score of 0.86 and accuracy of 0.82, identifying 1,198 chronic isolation cases. Significant associations are found with gender, sexual orientation, and marital status. Our approach overcomes the bottleneck of extracting latent social determinants from unstructured death investigation narratives, offering a scalable, interpretable technical pathway for proactive surveillance and precision intervention targeting loneliness-related suicide risk.

📝 Abstract
Social isolation and loneliness, which have been increasing in recent years, strongly contribute to suicide rates. Although social isolation and loneliness are not currently recorded within the structured variables of the US National Violent Death Reporting System (NVDRS), natural language processing (NLP) techniques can be used to identify these constructs in law enforcement and coroner/medical examiner narratives. Using topic modeling for lexicon development and supervised learning classifiers, we developed high-quality classifiers (average F1: .86, accuracy: .82). Evaluating over 300,000 suicides from 2002 to 2020, we identified 1,198 mentioning chronic social isolation. Decedents had higher odds of chronic social isolation classification if they were men (OR = 1.44; CI: 1.24, 1.69; p<.0001), gay (OR = 3.68; CI: 1.97, 6.33; p<.0001), or divorced (OR = 3.34; CI: 2.68, 4.19; p<.0001). We also found significant predictors for other social isolation topics: recent or impending divorce, child custody loss, eviction or recent move, and break-up. Our methods can improve surveillance and prevention of social isolation and loneliness in the United States.
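The abstract reports odds ratios with confidence intervals but does not say how they were estimated. As a minimal sketch, assuming an unadjusted 2x2 table and a Wald interval on the log odds (the authors may well have used adjusted regression models instead), an odds ratio and its interval could be computed as below. The function name `odds_ratio_ci` and the counts are invented for illustration and are not NVDRS data.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval.

    a: group of interest, classified as isolated
    b: group of interest, not classified as isolated
    c: comparison group, classified as isolated
    d: comparison group, not classified as isolated
    z: normal quantile (1.96 corresponds to a 95% interval)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) from the four cell counts.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, chosen only to exercise the function.
or_, lo, hi = odds_ratio_ci(30, 970, 20, 980)
print(f"OR = {or_:.2f}; CI: {lo:.2f}, {hi:.2f}")
```

An interval that excludes 1 is the usual signal that the association is statistically significant at the chosen level, which is consistent with how the intervals in the abstract read.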
Problem

Research questions and friction points this paper is trying to address.

Detect social isolation in NVDRS narratives using NLP
Develop classifiers to identify chronic social isolation cases
Analyze predictors like gender and divorce for isolation risks
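For readers less familiar with how such classifiers are scored, here is a minimal sketch of the two metrics the abstract reports, accuracy and F1, for a binary label. The helper name `binary_metrics` and the toy labels are assumptions for illustration; the paper's own evaluation setup (for example, how the F1 average was taken) is not described on this page.

```python
def binary_metrics(y_true, y_pred):
    """Return accuracy, precision, recall and F1 for 0/1 labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and equal in length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels: 1 = narrative mentions social isolation, 0 = it does not.
gold = [1, 1, 1, 0, 0, 1]
predicted = [1, 1, 0, 0, 1, 1]
print(binary_metrics(gold, predicted))
```

F1 ignores true negatives while accuracy counts them, so the two numbers can differ in either direction depending on class balance.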
Innovation

Methods, ideas, or system contributions that make the work stand out.

Using topic modeling for lexicon development
Applying supervised learning classifiers for identification
Analyzing NVDRS narratives with NLP techniques
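To make the lexicon idea concrete, the sketch below tags free-text narratives with topics by matching lexicon terms. It assumes a lexicon already exists (in the paper it is developed with topic modeling); the terms, topic names, helper names, and example sentences here are all hypothetical and are not taken from the paper or from NVDRS. Matches like these could serve as candidate labels or features for a supervised classifier, not as a final classification.

```python
import re

# Hypothetical lexicon: topic name -> terms. Not the paper's lexicon.
LEXICON = {
    "chronic_isolation": ["lived alone", "no friends", "isolated", "loner"],
    "recent_divorce": ["divorce", "separated from"],
}


def compile_lexicon(lexicon):
    """Build one case-insensitive, whole-word regex per topic."""
    patterns = {}
    for topic, terms in lexicon.items():
        alternation = "|".join(re.escape(term) for term in terms)
        patterns[topic] = re.compile(rf"\b(?:{alternation})\b", re.IGNORECASE)
    return patterns


def tag_narrative(text, patterns):
    """Return the sorted list of topics whose terms appear in the text."""
    return sorted(topic for topic, pat in patterns.items() if pat.search(text))


patterns = compile_lexicon(LEXICON)

# Invented example sentences, written only for this sketch.
examples = [
    "The decedent lived alone and was described as a loner.",
    "The victim's wife had filed for divorce last month.",
    "No relevant circumstances noted.",
]
for narrative in examples:
    print(tag_narrative(narrative, patterns), "<-", narrative)
```

Whole-word matching keeps the sketch simple but misses inflected forms such as "divorced", which is one reason a learned classifier on top of the lexicon is useful.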
Drew Walker
Department of Biomedical Informatics, School of Medicine, Emory University, Atlanta GA, USA
Swati Rajwal
Department of Computer Science, Emory College of Arts and Sciences, Emory University, Atlanta, GA, USA
Sudeshna Das
Associate Prof. of Neurology Harvard Medical School
Bioinformatics
Snigdha Peddireddy
Department of Behavioral, Social, Health Education Sciences, Rollins School of Public Health, Emory University, Atlanta GA, USA
Abeed Sarker
Emory University School of Medicine
Natural Language Processing, Biomedical Informatics, Health Data Science, Applied Machine Learning