Towards detecting the pathological subharmonic voicing with fully convolutional neural networks

📅 2025-01-15
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study addresses the clinical challenge of distinguishing pathological subharmonic phonation, such as the low-frequency periodic perturbations induced by vocal fold lesions, from normal voice. The authors propose an end-to-end detection method based on fully convolutional neural networks (FCNs) that classifies the subharmonic period of a voice signal. Because normal and subharmonic phonation are both nearly periodic, the subharmonic period cannot be read off a single glottal cycle; the network instead analyzes cycle-to-cycle variation across the whole signal and learns the relevant features implicitly. Trained on synthesized subharmonic voice signals, the model reaches 98.2% classification accuracy on synthetic data. On real sustained vowel recordings, the results are encouraging but also expose areas for future improvement.
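One reason a fully convolutional design suits whole-signal analysis is that it contains no fixed-size dense layer, so the same weights can score recordings of any length. The toy sketch below illustrates only that property. The function names and the hand-picked, untrained kernels are assumptions made for illustration and are not the architecture described in the paper.

```python
def conv1d(x, kernel):
    """Valid 1-D cross-correlation; assumes len(x) >= len(kernel)."""
    k = len(kernel)
    return [sum(x[i + j] * kernel[j] for j in range(k))
            for i in range(len(x) - k + 1)]


def relu(x):
    """Element-wise rectifier."""
    return [max(0.0, v) for v in x]


def fcn_scores(x, kernels):
    """One convolution per class, ReLU, then global average pooling.

    Global pooling collapses the time axis, so the output always has
    len(kernels) entries regardless of the input length.
    """
    scores = []
    for kernel in kernels:
        feature_map = relu(conv1d(x, kernel))
        scores.append(sum(feature_map) / len(feature_map))
    return scores


# Hypothetical, untrained kernels: one per output class.
KERNELS = [[1.0, -1.0], [0.5, 0.5]]

short_scores = fcn_scores([0.0, 1.0, 0.0, 1.0], KERNELS)
long_scores = fcn_scores([0.0, 1.0] * 50, KERNELS)
```

Both calls return two class scores even though the inputs differ in length by a factor of 25. A trained network would stack many such layers with learned kernels.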

📝 Abstract
Many voice disorders induce subharmonic phonation, but voice signal analysis currently lacks a technique to reliably detect the presence of subharmonics. Distinguishing subharmonic phonation from normal phonation is challenging because both are nearly periodic phenomena. Subharmonic phonation adds cyclical variations to the normal glottal cycles; hence, estimating the subharmonic period requires a holistic analysis of the signal. Deep learning is an effective solution to this type of complex problem. This paper describes fully convolutional neural networks that are trained with synthesized subharmonic voice signals to classify the subharmonic periods. Synthetic evaluation shows over 98% classification accuracy, and an assessment of sustained vowel recordings demonstrates encouraging outcomes as well as areas for future improvement.
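To make "cyclical variations added to the normal glottal cycles" concrete, the sketch below generates a labeled toy signal in which the amplitude pattern repeats every `period` cycles. It is a deliberately simplified stand-in and not the synthesizer used in the paper. The function name, the sine-shaped cycle, and all parameter defaults are assumptions chosen for illustration.

```python
import math


def synth_subharmonic(f0=100.0, period=2, depth=0.3, fs=8000, n_cycles=12):
    """Return (waveform, label) for a toy subharmonic signal.

    period=1 yields strictly periodic cycles (normal phonation in this toy
    model). period=M > 1 attenuates the last cycle in every group of M
    cycles by `depth`, so the pattern repeats at f0 / M.
    """
    samples_per_cycle = int(round(fs / f0))
    signal = []
    for cycle in range(n_cycles):
        # Weaken one cycle per group to create the subharmonic pattern.
        weak = period > 1 and cycle % period == period - 1
        amp = 1.0 - depth if weak else 1.0
        for n in range(samples_per_cycle):
            signal.append(amp * math.sin(2 * math.pi * n / samples_per_cycle))
    return signal, period  # the subharmonic period doubles as the class label


def cycle_peaks(signal, samples_per_cycle):
    """Peak amplitude of each consecutive cycle-length segment."""
    return [max(signal[i:i + samples_per_cycle])
            for i in range(0, len(signal), samples_per_cycle)]


waveform, label = synth_subharmonic(period=2, n_cycles=4)
peaks = cycle_peaks(waveform, 80)  # 8000 Hz / 100 Hz = 80 samples per cycle
```

With `period=2`, the per-cycle peaks alternate between a strong and a weak value. Any single cycle still looks like a plausible normal cycle, which is why the period has to be inferred from several cycles at once.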
Problem

Research questions and friction points this paper is trying to address.

Subharmonic Voicing Detection
Disease-Related Voices
Human Voice Analysis
Innovation

Methods, ideas, or system contributions that make the work stand out.

Fully Convolutional Neural Networks
Subharmonic Voicing Detection
High Precision Acoustic Recognition
T. Ikuma
Department of Otolaryngology–Head and Neck Surgery, Louisiana State University Health Sciences Center, New Orleans, LA
Melda Kunduk
Louisiana State University
Brad Story
Dept. of Speech, Language, Hearing Sciences, University of Arizona
Andrew J. McWhorter
Department of Otolaryngology–Head and Neck Surgery, Louisiana State University Health Sciences Center, New Orleans, LA