🤖 AI Summary
Subharmonic phonation in clinical voice analysis frequently causes erroneous fundamental frequency (F₀) estimation, undermining diagnostic reliability. Method: We constructed the first sustained-vowel subharmonic dataset, used the subharmonics-to-harmonics ratio (SHR) as a quantitative measure of subharmonic strength, and developed a quality-of-estimate classification framework to identify subharmonic-induced F₀ estimation errors. We systematically evaluated the robustness of five F₀ estimators (Praat, YAAPT, Harvest, CREPE, and FCN-F0) under subharmonic conditions. Results: FCN-F0 achieved the highest overall accuracy and resolved subharmonic signals most reliably; CREPE and Harvest also performed well. SHR exposed differences among the algorithms in their sensitivity to subharmonics. This work provides a benchmark and practical tools for objective assessment and algorithmic refinement in subharmonic voice analysis.
📝 Abstract
In clinical voice signal analysis, mishandling of subharmonic voicing can cause acoustic parameters to yield false negatives. The ability of a fundamental frequency estimator to identify the speaking fundamental frequency is therefore critical. This paper presents a sustained-vowel study that used a quality-of-estimate classification to identify subharmonic errors and the subharmonics-to-harmonics ratio (SHR) to measure the strength of subharmonic voicing. Five estimators were evaluated on a sustained-vowel dataset: Praat, YAAPT, Harvest, CREPE, and FCN-F0. FCN-F0, a deep-learning model, performed best both in overall accuracy and in correctly resolving subharmonic signals. CREPE and Harvest are also highly capable estimators for sustained-vowel analysis.
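To make the idea of measuring subharmonic strength concrete, the sketch below computes a simplified ratio of spectral amplitude at subharmonic frequencies (odd multiples of F0/2) to amplitude at the harmonics of F0. This is only an illustration under stated assumptions: the function names `synth_vowel` and `simple_shr` are hypothetical, the true F0 is assumed known, the signal is synthetic, and the linear-frequency ratio used here is not necessarily the SHR definition used in the paper.

```python
import numpy as np


def synth_vowel(f0, sub_amp, sr=16000, dur=1.0, n_harm=5):
    """Synthesize a toy sustained tone (hypothetical helper).

    Harmonics sit at k*f0 with amplitude 1/k; subharmonic partials sit
    at (k - 0.5)*f0 with amplitude sub_amp/k.
    """
    t = np.arange(int(sr * dur)) / sr
    x = np.zeros_like(t)
    for k in range(1, n_harm + 1):
        x += np.sin(2 * np.pi * k * f0 * t) / k
        x += sub_amp * np.sin(2 * np.pi * (k - 0.5) * f0 * t) / k
    return x


def simple_shr(x, f0, sr=16000, n_harm=5):
    """Simplified subharmonic-to-harmonic amplitude ratio.

    Assumption: f0 is known, and the ratio is taken on a linear
    frequency axis from the nearest FFT bins.
    """
    spec = np.abs(np.fft.rfft(x * np.hanning(len(x))))
    freqs = np.fft.rfftfreq(len(x), 1.0 / sr)

    def amp_at(f):
        # Magnitude at the FFT bin closest to frequency f.
        return spec[np.argmin(np.abs(freqs - f))]

    harm = sum(amp_at(k * f0) for k in range(1, n_harm + 1))
    sub = sum(amp_at((k - 0.5) * f0) for k in range(1, n_harm + 1))
    return sub / harm


clean = simple_shr(synth_vowel(200.0, 0.0), 200.0)   # no subharmonics
rough = simple_shr(synth_vowel(200.0, 0.5), 200.0)   # half-strength subharmonics
print(f"clean: {clean:.3f}, rough: {rough:.3f}")
```

On this toy signal the ratio stays near zero for the clean tone and rises with the injected subharmonic amplitude. An estimator that locks onto the 100 Hz subharmonic instead of the 200 Hz fundamental would report an octave-low F0, which is the kind of error the study's quality-of-estimate classification is meant to flag.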