Recognition of Dysarthria in Amyotrophic Lateral Sclerosis patients using Hypernetworks

📅 2025-02-27

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This paper targets dysarthria recognition in amyotrophic lateral sclerosis (ALS) patients and, according to the authors, is the first study to apply hypernetworks to this task. Audio recordings are converted into log-Mel spectrograms together with their first- and second-order derivatives (Δ and ΔΔ), and the resulting image is passed through a pretrained, modified AlexNet model. A hypernetwork then generates the weights of a target network, which makes the computation input-conditional. Evaluated on the publicly available VOC-ALS dataset, the approach reaches up to 82.66% accuracy and outperforms strong baselines, including multimodal fusion methods. An ablation study supports the effectiveness of the method, and the authors report advantages in generalization ability, parameter efficiency, and robustness.

📝 Abstract

Amyotrophic Lateral Sclerosis (ALS) is a progressive neurodegenerative disease with varying symptoms, including a decline in speech intelligibility. Existing studies, which recognize dysarthria in ALS patients by predicting the clinical standard ALSFRS-R, rely on feature extraction strategies and on customized convolutional neural networks followed by dense layers. However, recent studies have shown that neural networks adopting input-conditional computation enjoy a series of benefits, including faster training, better performance, and flexibility. To bring these benefits to this task, we present the first study incorporating hypernetworks for recognizing dysarthria. Specifically, we convert audio files into log-Mel spectrograms with their delta and delta-delta features and pass the resulting image through a pretrained, modified AlexNet model. Finally, we use a hypernetwork, which generates the weights for a target network. Experiments are conducted on a newly collected, publicly available dataset, namely VOC-ALS. Results show that the proposed approach reaches an accuracy of up to 82.66%, outperforming strong baselines, including multimodal fusion methods, while findings from an ablation study demonstrate the effectiveness of the introduced methodology. Overall, our approach incorporating hypernetworks offers advantages over state-of-the-art methods in terms of generalization ability, parameter efficiency, and robustness.
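
The three-channel input described in the abstract can be illustrated with a small sketch. It assumes a log-Mel spectrogram is already available as a list of frames and uses the common regression formula for delta features. The window width, the edge padding, and the function names `delta` and `stack_three_channels` are assumptions made for illustration; the abstract does not state which settings the authors use.

```python
def delta(frames, width=2):
    """Regression-based delta along the time axis.

    frames: list of T frames, each a list of F floats.
    Computes d_t = sum_{n=1..width} n * (c_{t+n} - c_{t-n}) / (2 * sum n^2),
    repeating the first and last frame at the edges (an assumption).
    """
    num_frames = len(frames)
    denom = 2 * sum(n * n for n in range(1, width + 1))
    out = []
    for t in range(num_frames):
        row = []
        for f in range(len(frames[t])):
            acc = 0.0
            for n in range(1, width + 1):
                ahead = frames[min(t + n, num_frames - 1)][f]
                behind = frames[max(t - n, 0)][f]
                acc += n * (ahead - behind)
            row.append(acc / denom)
        out.append(row)
    return out


def stack_three_channels(log_mel):
    """Return [log-Mel, delta, delta-delta] as three image-like channels."""
    d1 = delta(log_mel)   # first-order derivative over time
    d2 = delta(d1)        # second-order derivative over time
    return [log_mel, d1, d2]


# Toy input: 6 frames with 2 Mel bands each (real spectrograms are much larger).
toy_log_mel = [[float(t), 2.0 * t] for t in range(6)]
channels = stack_three_channels(toy_log_mel)
```

Each channel has the same time-by-frequency shape, so the stack can be treated like an RGB image, which is what allows an image model such as AlexNet to consume it.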

Problem

Research questions and friction points this paper is trying to address.

Recognizing dysarthria in ALS patients using hypernetworks

Improving speech intelligibility prediction in ALS patients

Enhancing neural network performance with hypernetworks

Innovation

Methods, ideas, or system contributions that make the work stand out.

Hypernetworks generate weights for target networks.

Log-Mel spectrogram, delta, and delta-delta features represent the audio input.

Pretrained modified AlexNet model enhances feature extraction.
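
The idea of a hypernetwork generating weights for a target network can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' architecture: the hypernetwork here is a single linear map, it is conditioned on the feature vector itself, and the target network is one linear layer. The class `HyperNetwork`, the helper `target_forward`, and all dimensions are hypothetical.

```python
import random


def linear(weights, bias, x):
    """Compute weights @ x + bias for list-based matrices and vectors."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


class HyperNetwork:
    """Maps a conditioning vector to the weights and bias of a target linear layer."""

    def __init__(self, cond_dim, in_dim, out_dim, seed=0):
        rng = random.Random(seed)
        self.in_dim, self.out_dim = in_dim, out_dim
        n_generated = out_dim * in_dim + out_dim  # target weights + target biases
        # The hypernetwork's own parameters; in training, only these would be learned.
        self.w = [[rng.uniform(-0.1, 0.1) for _ in range(cond_dim)]
                  for _ in range(n_generated)]
        self.b = [0.0] * n_generated

    def generate(self, cond):
        """Produce (weights, bias) for the target layer from the conditioning vector."""
        flat = linear(self.w, self.b, cond)
        split = self.out_dim * self.in_dim
        weights = [flat[r * self.in_dim:(r + 1) * self.in_dim]
                   for r in range(self.out_dim)]
        return weights, flat[split:]


def target_forward(hyper, features):
    """Target layer whose parameters are generated from the features (assumed conditioning)."""
    weights, bias = hyper.generate(features)
    return linear(weights, bias, features)


# Hypothetical 4-dimensional feature vector standing in for backbone features.
features = [0.5, -1.0, 0.25, 2.0]
hyper = HyperNetwork(cond_dim=4, in_dim=4, out_dim=2)
logits = target_forward(hyper, features)  # two class scores
```

Because the target layer's weights are recomputed for every input, the classifier is input-conditional: two recordings with different features are scored by different weight matrices, while the only stored parameters are those of the hypernetwork.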
