Perceptual Implications of Automatic Anonymization in Pathological Speech

📅 2025-05-01
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
Automatic speech anonymization lacks empirical validation of its perceptual fidelity for pathological speech, hindering its ethical use in clinical data sharing. Method: We conducted a human-machine Turing-style auditory experiment involving four pathological groups (cleft lip and palate, dysarthria, dysglossia, dysphonia) and age-matched healthy controls, employing state-of-the-art anonymization models (EER 30–40%). Listeners with diverse linguistic, clinical, and technical backgrounds completed zero-shot (single-exposure) and few-shot (repeated-exposure) discrimination and quality rating tasks, analyzed with repeated-measures ANOVA and cross-group comparisons. Contribution/Results: Listeners distinguished anonymized from original speech with high accuracy (91% zero-shot; 93% few-shot), yet perceived quality dropped significantly after anonymization (83% → 59%, *p* < 0.001), with pathology-specific degradation patterns across disorder types (*p* = 0.005), and the native-listener advantage vanished post-anonymization. Crucially, automated privacy metrics showed no correlation with human perception. These findings provide the first empirical basis for perceptually grounded standards for pathological speech anonymization.
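For context on the EER figures quoted above: equal error rate is the operating point where a speaker verifier's false-acceptance and false-rejection rates coincide, so 30–40% on anonymized speech means the verifier is close to chance at re-identifying speakers. A minimal sketch of the standard computation, using synthetic placeholder scores rather than the paper's data:

```python
import numpy as np
from sklearn.metrics import roc_curve

def equal_error_rate(scores, labels):
    """EER: the operating point where false-accept rate equals false-reject rate.

    scores: similarity scores from a speaker verifier (higher = more likely same speaker).
    labels: 1 for genuine (same-speaker) trials, 0 for impostor trials.
    """
    fpr, tpr, _ = roc_curve(labels, scores)
    fnr = 1 - tpr
    idx = np.nanargmin(np.abs(fnr - fpr))  # point where the two error-rate curves cross
    return (fpr[idx] + fnr[idx]) / 2

# Hypothetical verification trials against anonymized speech:
rng = np.random.default_rng(0)
genuine = rng.normal(0.6, 0.3, 500)   # same-speaker trial scores
impostor = rng.normal(0.4, 0.3, 500)  # different-speaker trial scores
scores = np.concatenate([genuine, impostor])
labels = np.concatenate([np.ones(500), np.zeros(500)])
print(f"EER ≈ {equal_error_rate(scores, labels):.1%}")  # heavy overlap gives ~30-40%
```

A higher EER after anonymization indicates stronger privacy protection; the paper's central point is that this automatic metric did not track human perceptual judgments.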

📝 Abstract
Automatic anonymization techniques are essential for ethical sharing of pathological speech data, yet their perceptual consequences remain understudied. This study presents the first comprehensive human-centered analysis of anonymized pathological speech, using a structured perceptual protocol involving ten native and non-native German listeners with diverse linguistic, clinical, and technical backgrounds. Listeners evaluated anonymized-original utterance pairs from 180 speakers spanning Cleft Lip and Palate, Dysarthria, Dysglossia, Dysphonia, and age-matched healthy controls. Speech was anonymized using state-of-the-art automatic methods (equal error rates in the range of 30–40%). Listeners completed Turing-style discrimination and quality rating tasks under zero-shot (single-exposure) and few-shot (repeated-exposure) conditions. Discrimination accuracy was high overall (91% zero-shot; 93% few-shot), but varied by disorder (repeated-measures ANOVA: p = 0.007), ranging from 96% (Dysarthria) to 86% (Dysphonia). Anonymization consistently reduced perceived quality (from 83% to 59%, p < 0.001), with pathology-specific degradation patterns (one-way ANOVA: p = 0.005). Native listeners rated original speech slightly higher than non-native listeners (Δ = 4%, p = 0.199), but this difference nearly disappeared after anonymization (Δ = 1%, p = 0.724). No significant gender-based bias was observed. Critically, human perceptual outcomes did not correlate with automatic privacy or clinical utility metrics. These results underscore the need for listener-informed, disorder- and context-specific anonymization strategies that preserve privacy while maintaining interpretability, communicative functions, and diagnostic utility, especially for vulnerable populations such as children.
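The abstract's group comparisons could be reproduced on rating data with standard tools. A minimal sketch, assuming a hypothetical table of listener quality ratings; the paper reports a repeated-measures ANOVA, which this simplified analogue approximates with a paired t-test plus a one-way ANOVA on per-group quality drops:

```python
import numpy as np
import pandas as pd
from scipy import stats

# Hypothetical quality ratings (%), one row per listener x group;
# column names are illustrative, not the paper's actual data schema.
rng = np.random.default_rng(1)
groups = ["CLP", "Dysarthria", "Dysglossia", "Dysphonia", "Control"]
rows = []
for listener in range(10):
    for group in groups:
        rows.append({"listener": listener, "group": group,
                     "original": rng.normal(83, 6),
                     "anonymized": rng.normal(59, 8)})
df = pd.DataFrame(rows)

# Paired test: does anonymization reduce perceived quality overall?
t, p = stats.ttest_rel(df["original"], df["anonymized"])
print(f"paired t-test: t = {t:.2f}, p = {p:.3g}")

# One-way ANOVA: does the size of the quality drop differ across disorder groups?
df["drop"] = df["original"] - df["anonymized"]
samples = [g["drop"].values for _, g in df.groupby("group")]
f, p = stats.f_oneway(*samples)
print(f"one-way ANOVA across groups: F = {f:.2f}, p = {p:.3g}")
```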
Problem

Research questions and friction points this paper is trying to address.

Evaluates perceptual effects of anonymizing pathological speech data
Assesses listener discrimination accuracy across various speech disorders
Examines quality degradation patterns in anonymized pathological speech
Innovation

Methods, ideas, or system contributions that make the work stand out.

Human-centered analysis of anonymized pathological speech
Turing-style discrimination and quality rating tasks (see the sketch after this list)
Listener-informed disorder-specific anonymization strategies
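A minimal sketch of how the Turing-style discrimination task could be scored per disorder group and exposure condition; the trial structure and field names here are hypothetical, not the paper's protocol:

```python
from collections import defaultdict

# Hypothetical trials: each pairs an original and an anonymized utterance,
# and the listener must identify which one is anonymized.
trials = [
    {"group": "Dysarthria", "condition": "zero-shot", "correct": True},
    {"group": "Dysphonia",  "condition": "zero-shot", "correct": False},
    {"group": "Dysarthria", "condition": "few-shot",  "correct": True},
    # ... one entry per listener x utterance pair
]

hits = defaultdict(int)
totals = defaultdict(int)
for trial in trials:
    key = (trial["group"], trial["condition"])
    hits[key] += trial["correct"]  # bool counts as 0/1
    totals[key] += 1

for group, condition in sorted(totals):
    acc = hits[(group, condition)] / totals[(group, condition)]
    print(f"{group:12s} {condition:9s} accuracy = {acc:.0%}")
```

Aggregating accuracy this way per group and condition is what makes the reported spread (96% for Dysarthria down to 86% for Dysphonia) visible.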
Soroosh Tayebi Arasteh
RWTH Aachen University
Deep Learning · AI in Medicine · Generative AI · Medical Image Analysis
Saba Afza
Pattern Recognition Lab, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
Tri-Thien Nguyen
Pattern Recognition Lab, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
Lukas Buess
Pattern Recognition Lab, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
Maryam Parvin
Pattern Recognition Lab, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
Tomás Arias-Vergara
Pattern Recognition Lab, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
P. A. Pérez-Toro
Pattern Recognition Lab, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
Hiuching Hung
Department of Foreign Language Education, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
Mahshad Lotfinia
RWTH Aachen University
Artificial Intelligence · Deep Learning · Medical Image Analysis
Thomas Gorges
Pattern Recognition Lab, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
Elmar Noeth
Pattern Recognition Lab, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
Maria Schuster
Department of Otorhinolaryngology, Head and Neck Surgery, Ludwig-Maximilians-Universität München, Munich, Germany
Seung Hee Yang
Speech & Language Processing Lab, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
Andreas Maier
Pattern Recognition Lab, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany