Robust Speech and Natural Language Processing Models for Depression Screening

📅 2020-12-05
🏛️ IEEE Signal Processing in Medicine and Biology Symposium
📈 Citations: 5
Influential: 0
🤖 AI Summary
To address the need for remote preliminary depression screening, this paper describes two deep learning models for binary depression classification: one based on acoustics and one based on natural language processing, both using transfer learning. The data come from a depression-labeled corpus in which 11,000 unique users interacted with a human-machine application using conversational speech, and test speakers do not overlap with training speakers. Both models reach AUC ≥ 0.80 on unseen speakers, and an analysis over test subsets finds them generally robust across speaker and session variables. The authors conclude that these approaches are promising for generalized automated depression screening.

📝 Abstract
Depression is a global health concern with a critical need for increased patient screening. Speech technology offers advantages for remote screening but must perform robustly across patients. We describe two deep learning models developed for this purpose. One model is based on acoustics; the other is based on natural language processing. Both models employ transfer learning. We use data from a depression-labeled corpus in which 11,000 unique users interacted with a human-machine application using conversational speech. Results on binary depression classification show that both models perform at or above AUC=0.80 on unseen data with no speaker overlap. Performance is further analyzed as a function of test subset characteristics, finding that the models are generally robust over speaker and session variables. We conclude that models based on these approaches offer promise for generalized automated depression screening.
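The evaluation setup in the abstract (binary classification scored by AUC on test data with no speaker overlap) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names `speaker_disjoint_split` and `roc_auc`, the toy speaker IDs, labels, and scores are all hypothetical, and the AUC is computed with the standard pairwise-ranking definition rather than any method stated in the paper.

```python
import random


def speaker_disjoint_split(speaker_ids, test_fraction=0.2, seed=0):
    """Split session indices so that no speaker lands in both partitions.

    Hypothetical helper: speakers (not sessions) are shuffled and assigned
    to the test set, so every session of a test speaker is held out.
    """
    speakers = sorted(set(speaker_ids))
    rng = random.Random(seed)
    rng.shuffle(speakers)
    n_test = max(1, round(len(speakers) * test_fraction))
    test_speakers = set(speakers[:n_test])
    train_idx = [i for i, s in enumerate(speaker_ids) if s not in test_speakers]
    test_idx = [i for i, s in enumerate(speaker_ids) if s in test_speakers]
    return train_idx, test_idx


def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (assumed for illustration): one entry per session.
speaker_ids = ["a", "a", "b", "c", "c", "d", "e", "e"]
train_idx, test_idx = speaker_disjoint_split(speaker_ids, test_fraction=0.4)

# Check that the two partitions share no speaker.
train_speakers = {speaker_ids[i] for i in train_idx}
test_speakers = {speaker_ids[i] for i in test_idx}
print("speaker overlap:", train_speakers & test_speakers)

# Score a toy set of model outputs against binary depression labels.
print("AUC:", roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

Splitting by speaker rather than by session is what makes a reported AUC reflect performance on unseen people: a session-level split would let the model see other recordings of the same speaker during training.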
Problem

Research questions and friction points this paper is trying to address.

Depression Detection
Language Processing
Speech Processing
Innovation

Methods, ideas, or system contributions that make the work stand out.

Deep Learning
Depression Detection
Transfer Learning
Y. Lu
Ellipsis Health, San Francisco, California, USA
A. Harati
Ellipsis Health, San Francisco, California, USA
T. Rutowski
Ellipsis Health, San Francisco, California, USA
R. Oliveira
Ellipsis Health, San Francisco, California, USA
P. Chlebek
Ellipsis Health, San Francisco, California, USA
Elizabeth Shriberg
Chief Science Officer, Ellipsis Health
conversational AI, speech technology, speaker state detection, healthcare, affective computing