Comparative Evaluation of Acoustic Feature Extraction Tools for Clinical Speech Analysis

📅 2025-06-01
📈 Citations: 0
Influential: 0
🤖 AI Summary
Clinical voice analysis lacks cross-platform comparability in acoustic feature extraction, hindering reproducibility of voice biomarkers. Method: This study systematically evaluates feature consistency and discriminative performance of OpenSMILE, Praat, and Librosa on speech from individuals with schizophrenia-spectrum disorders and healthy controls, using standardized parameter configurations. Pearson correlation analysis and AUC-based binary classification assess inter-tool agreement and clinical utility. Results: F0 percentiles exhibit high cross-tool consistency (r = 0.962–0.999), whereas F0 standard deviation and formant frequencies show low or negative correlations; correlation patterns diverge significantly between patient and control groups. F0 mean, harmonic-to-noise ratio (HNR), and MFCC1 achieve AUC > 0.70 in disorder classification. The study proposes a multi-tool cross-validation framework and transparent reporting standards to enhance methodological rigor and reproducibility in clinical voice biomarker research.

📝 Abstract
This study compares three acoustic feature extraction toolkits (OpenSMILE, Praat, and Librosa) applied to clinical speech data from individuals with schizophrenia spectrum disorders (SSD) and healthy controls (HC). By standardizing extraction parameters across the toolkits, we analyzed speech samples from 77 SSD and 87 HC participants and found significant toolkit-dependent variation. While F0 percentiles showed high cross-toolkit correlation (r = 0.962–0.999), measures such as F0 standard deviation and formant values often showed poor or even negative agreement. Additionally, correlation patterns differed between the SSD and HC groups. Classification analysis identified F0 mean, HNR, and MFCC1 (AUC > 0.70) as promising discriminators. These findings underscore reproducibility concerns and argue for standardized protocols, multi-toolkit cross-validation, and transparent reporting.
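The inter-tool agreement analysis described above can be sketched with a Pearson correlation between per-speaker summary statistics produced by two tools. The data below are simulated stand-ins, not real OpenSMILE/Praat/Librosa output; the contrast mirrors the paper's finding that F0 central-tendency measures agree across tools while dispersion measures may not.

```python
# Hypothetical sketch of the cross-tool agreement check: per-speaker
# F0 statistics from two extraction tools, compared with Pearson's r.
# All values are simulated, not actual toolkit measurements.
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(0)
n_speakers = 50

# Simulated per-speaker F0 means (Hz) from "tool A"
f0_mean_a = rng.uniform(90, 250, size=n_speakers)
# "Tool B" tracks F0 mean closely (small measurement noise) ...
f0_mean_b = f0_mean_a + rng.normal(0, 2, size=n_speakers)

# ... while a dispersion measure (F0 SD) can diverge between tools;
# here the two tools' SD estimates are simulated as unrelated
f0_sd_a = rng.uniform(10, 40, size=n_speakers)
f0_sd_b = rng.uniform(10, 40, size=n_speakers)

r_mean, _ = pearsonr(f0_mean_a, f0_mean_b)
r_sd, _ = pearsonr(f0_sd_a, f0_sd_b)
print(f"F0 mean agreement: r = {r_mean:.3f}")  # high, near 1
print(f"F0 SD agreement:   r = {r_sd:.3f}")    # low, near 0
```

The same comparison applied per diagnostic group would reveal the group-dependent correlation patterns the paper reports.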
Problem

Research questions and friction points this paper is trying to address.

Compare acoustic feature extraction tools for clinical speech analysis
Evaluate toolkit-dependent variations in speech data from SSD and HC groups
Identify promising speech features for discriminating SSD and HC
Innovation

Methods, ideas, or system contributions that make the work stand out.

Standardized extraction parameters across toolkits
Identified promising discriminators for classification
Advocated multi-toolkit cross-validation methods
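The AUC-based feature screening behind the second bullet can be illustrated as follows. This is a minimal sketch on synthetic data, not the study's measurements: each candidate feature is scored with `roc_auc_score` against SSD/HC labels, and features clearing the paper's 0.70 threshold are flagged. Group sizes (77 SSD, 87 HC) follow the abstract; the feature names and distributions are hypothetical.

```python
# Sketch of univariate AUC screening for candidate voice biomarkers.
# Synthetic data only; "f0_mean" is simulated with a group shift,
# "jitter" without one.
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(42)
n_ssd, n_hc = 77, 87                          # group sizes from the abstract
labels = np.concatenate([np.ones(n_ssd), np.zeros(n_hc)])

features = {
    "f0_mean": np.concatenate([rng.normal(0.8, 1.0, n_ssd),
                               rng.normal(0.0, 1.0, n_hc)]),
    "jitter":  np.concatenate([rng.normal(0.0, 1.0, n_ssd),
                               rng.normal(0.0, 1.0, n_hc)]),
}

aucs = {}
for name, values in features.items():
    auc = roc_auc_score(labels, values)
    auc = max(auc, 1.0 - auc)                 # direction-agnostic discriminability
    aucs[name] = auc
    flag = "promising" if auc > 0.70 else "weak"
    print(f"{name}: AUC = {auc:.3f} ({flag})")
```

Repeating this screen with features from each toolkit, and keeping only features whose AUC holds up across all three, is one way to operationalize the multi-toolkit cross-validation the paper advocates.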