🤖 AI Summary
To address the dual challenges of scarce authentic speech samples and limited model interpretability in speech-based Alzheimer's disease (AD) diagnosis, this paper proposes Reverse-Speech-Finder (RSF), a neural network backtracking architecture. RSF traces backward through three levels, from the most probable neurons (MPNs) to the most probable speech tokens (MPTs) to the most probable speech markers (MPMs), in order to extract AD-specific, interpretable speech markers from pre-trained large language models and to generate synthetic speech data that encapsulates those markers. The reported experiments show improvements of 3.5% in accuracy and 3.2% in F1-score over interpretability methods such as SHAP and Integrated Gradients. By combining marker discovery with data augmentation, RSF targets low-resource, speech-based AD diagnostics.
📝 Abstract
This study introduces Reverse-Speech-Finder (RSF), a neural network backtracking architecture designed to enhance Alzheimer's Disease (AD) diagnosis through speech analysis. Leveraging pre-trained large language models, RSF identifies and utilizes the most probable AD-specific speech markers, addressing both the scarcity of real AD speech samples and the limited interpretability of existing models. RSF rests on three core innovations. First, it exploits the observation that the speech markers most likely to predict AD, defined as the most probable speech-markers (MPMs), must have the highest probability of activating the neurons in the network that have the highest probability of predicting AD, defined as the most probable neurons (MPNs). Second, it uses a speech-token representation at the input layer, allowing backtracking from MPNs to identify the most probable speech-tokens (MPTs) of AD. Third, it develops a backtracking method that tracks backwards from the MPNs to the input layer, identifying the MPTs and the corresponding MPMs and uncovering novel speech markers for AD detection. Experimental results show that RSF outperforms traditional methods such as SHAP and Integrated Gradients, achieving a 3.5% improvement in accuracy and a 3.2% improvement in F1-score. By generating speech data that encapsulates novel markers, RSF mitigates the scarcity of real data and enhances the robustness and accuracy of AD diagnostic models. These findings underscore RSF's potential as a tool for speech-based AD detection, offering new insights into AD-related linguistic deficits and paving the way for more effective non-invasive early intervention strategies.
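The backtracking idea in the abstract (output, then MPNs, then MPTs) can be illustrated with a toy sketch. This is not the authors' implementation: the abstract gives no formulas, so the two-layer ReLU network, the bag-of-tokens input, the "activation times weight" contribution score, and every name below (`backtrack_tokens`, `W1`, `w2`, the four-token vocabulary) are assumptions chosen only to make the flow concrete.

```python
import numpy as np

def backtrack_tokens(x, W1, w2, top_neurons=2, top_tokens=2):
    """Toy backtracking: AD logit -> top hidden neurons -> top input tokens.

    x  : (n_tokens,) token counts for one transcript (assumed input encoding)
    W1 : (n_hidden, n_tokens) input-to-hidden weights
    w2 : (n_hidden,) hidden-to-AD-logit weights
    """
    h = np.maximum(W1 @ x, 0.0)        # ReLU hidden activations
    neuron_score = h * w2              # each neuron's contribution to the AD logit
    # Stand-in for the MPNs: neurons with the largest contributions.
    mpn = np.argsort(neuron_score)[::-1][:top_neurons]
    # Contribution of each token to the selected neurons,
    # weighted by those neurons' output weights.
    token_score = (w2[mpn, None] * W1[mpn] * x[None, :]).sum(axis=0)
    # Stand-in for the MPTs: tokens with the largest contributions.
    mpt = np.argsort(token_score)[::-1][:top_tokens]
    return mpn, mpt

# Hypothetical four-token vocabulary and hand-picked weights (illustrative only).
vocab = ["tok_a", "tok_b", "tok_c", "tok_d"]
x = np.array([3.0, 1.0, 2.0, 0.0])
W1 = np.array([[1.0, 0.0, 0.5, 0.0],
               [0.0, 1.0, 0.0, 1.0],
               [0.2, 0.0, 1.0, 0.0]])
w2 = np.array([2.0, -1.0, 1.0])

mpn, mpt = backtrack_tokens(x, W1, w2)
print("top neurons:", mpn.tolist())
print("top tokens:", [vocab[i] for i in mpt])
```

In a real setting the scores would come from a pre-trained model and the selected tokens would still need to be mapped to speech markers (MPMs); the sketch covers only the neuron-to-token step.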