Differential privacy enables fair and accurate AI-based analysis of speech disorders while protecting patient data

📅 2024-09-27
📈 Citations: 0
✹ Influential: 0
đŸ€– AI Summary
This study addresses the challenge of simultaneously ensuring patient privacy, diagnostic accuracy, and fairness in AI-based pathological speech analysis. We introduce, for the first time, differential privacy (DP) into this domain, proposing a DP-SGD training framework integrated with gradient inversion attack evaluation, multilingual transfer learning (English → German → Spanish), and fairness auditing across gender and age dimensions. This enables quantitative modeling of the privacy–accuracy–fairness trade-off. Evaluated on 200 hours of real German pathological speech data, our method incurs only a 3.85% accuracy drop while effectively resisting voice reconstruction attacks. It generalizes robustly to Spanish Parkinson’s speech data via cross-lingual transfer. Under practical privacy budgets (Δ ≀ 4), gender-related bias diminishes nearly to zero, and—critically—we identify and mitigate previously unreported age-associated fairness disparities in pathological speech classification.
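The summary references a DP-SGD training framework. Below is a minimal sketch of a single DP-SGD update (per-sample gradient clipping followed by calibrated Gaussian noise, in the style of Abadi et al., 2016). The function name and the clip-norm, noise-multiplier, and learning-rate values are illustrative assumptions, not the paper's settings:

```python
import numpy as np

def dp_sgd_step(weights, per_sample_grads, clip_norm=1.0,
                noise_mult=1.1, lr=0.1, rng=None):
    """One DP-SGD update: clip each per-sample gradient to clip_norm,
    average the clipped gradients, add Gaussian noise scaled by the
    noise multiplier, then take a gradient step."""
    rng = rng or np.random.default_rng(0)
    clipped = []
    for g in per_sample_grads:
        norm = np.linalg.norm(g)
        # Scale down any gradient whose L2 norm exceeds the clip bound.
        clipped.append(g * min(1.0, clip_norm / max(norm, 1e-12)))
    mean_grad = np.mean(clipped, axis=0)
    # Noise std follows the standard DP-SGD calibration: sigma * C / batch size.
    noise = rng.normal(0.0, noise_mult * clip_norm / len(per_sample_grads),
                       size=weights.shape)
    return weights - lr * (mean_grad + noise)
```

The clipping bounds each sample's influence on the update, which is what makes the added noise sufficient for a formal (Δ, ÎŽ) guarantee; tighter budgets (smaller Δ) correspond to larger noise multipliers and, as the study quantifies, a modest accuracy cost.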

📝 Abstract
Speech pathologies impact communication abilities and quality of life. While deep learning-based models have shown potential in diagnosing these disorders, the use of sensitive data raises critical privacy concerns. Although differential privacy (DP) has been explored in the medical imaging domain, its application in pathological speech analysis remains largely unexplored despite the equally critical privacy concerns. This study is the first to investigate DP's impact on pathological speech data, focusing on the trade-offs between privacy, diagnostic accuracy, and fairness. Using a large, real-world dataset of 200 hours of recordings from 2,839 German-speaking participants, we observed a maximum accuracy reduction of 3.85% when training with DP at high privacy levels. To highlight real-world privacy risks, we demonstrated the vulnerability of non-private models to explicit gradient inversion attacks, reconstructing identifiable speech samples and showcasing DP's effectiveness in mitigating these risks. To generalize our findings across languages and disorders, we validated our approach on a dataset of Spanish-speaking Parkinson's disease patients, leveraging models pretrained on healthy English-speaking datasets, and demonstrated that careful pretraining on large-scale task-specific datasets can maintain favorable accuracy under DP constraints. A comprehensive fairness analysis revealed minimal gender bias at reasonable privacy levels but underscored the need to address age-related disparities. Our results establish that DP can balance privacy and utility in speech disorder detection, while highlighting unique challenges in privacy-fairness trade-offs for speech data. This provides a foundation for refining DP methodologies and improving fairness across diverse patient groups in real-world deployments.
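The abstract demonstrates that non-private models leak training inputs through gradient inversion. As a toy illustration of why (not the paper's actual attack), consider a single linear layer: one sample's weight gradient is the outer product of the upstream error and the input, so the input is exactly recoverable from the shared gradients. All names and values below are hypothetical:

```python
import numpy as np

def invert_linear_gradients(grad_W, grad_b):
    """Recover the input x of a linear layer y = W @ x + b from one
    sample's gradients. Since grad_W = outer(dL/dy, x) and
    grad_b = dL/dy, each row i of grad_W equals grad_b[i] * x."""
    i = int(np.argmax(np.abs(grad_b)))  # pick a row with non-zero error
    return grad_W[i] / grad_b[i]

# Simulated leak: an observer sees only a single sample's gradients.
x_true = np.array([0.3, -1.2, 0.7])  # e.g. a speech feature vector
dLdy = np.array([0.5, -0.2])         # upstream error signal
grad_W, grad_b = np.outer(dLdy, x_true), dLdy
x_rec = invert_linear_gradients(grad_W, grad_b)
```

For deep networks the attacker instead optimizes a candidate input to match observed gradients, but the principle is the same; DP-SGD's clipping and noise break the exact correspondence this sketch exploits.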
Problem

Research questions and friction points this paper is trying to address.

Privacy Protection
Speech Disorder Diagnosis
Artificial Intelligence
Innovation

Methods, ideas, or system contributions that make the work stand out.

Differential Privacy
Speech Disorder Diagnosis
Fairness in Machine Learning
Soroosh Tayebi Arasteh
RWTH Aachen University
Deep Learning, AI in Medicine, Generative AI, Medical Image Analysis
Mahshad Lotfinia
RWTH Aachen University
Artificial Intelligence, Deep Learning, Medical Image Analysis
P. A. Pérez-Toro
Pattern Recognition Lab, Friedrich-Alexander-UniversitĂ€t Erlangen-NĂŒrnberg, Erlangen, Germany; GITA Lab, Faculty of Engineering, University of Antioquia, MedellĂ­n, Colombia
T. Arias-Vergara
Pattern Recognition Lab, Friedrich-Alexander-UniversitĂ€t Erlangen-NĂŒrnberg, Erlangen, Germany; GITA Lab, Faculty of Engineering, University of Antioquia, MedellĂ­n, Colombia
Mahtab Ranji
Pattern Recognition Lab, Friedrich-Alexander-UniversitĂ€t Erlangen-NĂŒrnberg, Erlangen, Germany
J. Orozco-Arroyave
Pattern Recognition Lab, Friedrich-Alexander-UniversitĂ€t Erlangen-NĂŒrnberg, Erlangen, Germany; GITA Lab, Faculty of Engineering, University of Antioquia, MedellĂ­n, Colombia
Maria Schuster
Department of Otorhinolaryngology, Head and Neck Surgery, Ludwig-Maximilians-UniversitĂ€t MĂŒnchen, Munich, Germany
Andreas Maier
Pattern Recognition Lab, Friedrich-Alexander-UniversitĂ€t Erlangen-NĂŒrnberg, Erlangen, Germany
Seung Hee Yang
Speech & Language Processing Lab, Friedrich-Alexander-UniversitĂ€t Erlangen-NĂŒrnberg, Erlangen, Germany