🤖 AI Summary
This study addresses the need for scalable speech-based screening for Alzheimer’s disease and related dementias by systematically evaluating how large language models (LLMs) can be adapted to analyze speech transcripts. We compare three text-side adaptation strategies: class-centroid-guided selection of in-context examples, reasoning-augmented prompting, and parameter-efficient fine-tuning, optionally with an added classification head. We also evaluate multimodal audio-text models. On the DementiaBank corpus, class-centroid demonstrations yield the best in-context learning performance, reasoning helps smaller models, and fine-tuning generally gives the best scores. A lightweight classification head substantially improves discriminative capability, especially for weaker base models. Fine-tuned audio-text models perform well but do not surpass the top text-only models. Properly adapted open-weight models match or exceed commercial systems, pointing toward low-cost, readily deployable cognitive impairment screening.
📝 Abstract
Over half of US adults with Alzheimer disease and related dementias remain undiagnosed, and speech-based screening offers a scalable detection approach. We compared large language model adaptation strategies for dementia detection on recordings from the DementiaBank speech corpus, evaluating nine text-only models and three multimodal audio-text models. Adaptations included in-context learning with different demonstration selection policies, reasoning-augmented prompting, parameter-efficient fine-tuning, and multimodal integration. Results showed that class-centroid demonstrations achieved the highest in-context learning performance, reasoning improved smaller models, and token-level fine-tuning generally produced the best scores. Adding a classification head substantially improved underperforming models. Among multimodal models, fine-tuned audio-text systems performed well but did not surpass the top text-only models. These findings highlight that model adaptation strategies, including demonstration selection, reasoning design, and tuning method, critically influence speech-based dementia detection, and that properly adapted open-weight models can match or exceed commercial systems.
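The abstract does not spell out how class-centroid demonstrations are chosen. One plausible reading, sketched below under stated assumptions, is to embed each labeled transcript, average the embeddings within each class, and use the examples nearest each class centroid as in-context demonstrations. The function name `class_centroid_demos`, the Euclidean distance metric, and the toy 2-D vectors are illustrative assumptions, not details taken from the paper.

```python
from collections import defaultdict
from math import dist


def class_centroid_demos(embeddings, labels, k=1):
    """Return, per class, the indices of the k examples nearest the class centroid.

    Assumptions (not from the source): embeddings are fixed-length float
    vectors from some text encoder, and closeness is Euclidean distance.
    """
    # Group example indices by their class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    selected = {}
    for label, indices in by_class.items():
        dim = len(embeddings[indices[0]])
        # Centroid = component-wise mean of the class's embeddings.
        centroid = [
            sum(embeddings[i][d] for i in indices) / len(indices)
            for d in range(dim)
        ]
        # Rank class members by distance to the centroid, nearest first.
        ranked = sorted(indices, key=lambda i: dist(embeddings[i], centroid))
        selected[label] = ranked[:k]
    return selected


# Toy 2-D vectors standing in for transcript embeddings (hypothetical data).
toy_embeddings = [
    [0.0, 0.0], [1.0, 0.0], [0.4, 0.1],   # labeled "control"
    [5.0, 5.0], [6.0, 5.0], [5.6, 5.1],   # labeled "dementia"
]
toy_labels = ["control"] * 3 + ["dementia"] * 3
demos = class_centroid_demos(toy_embeddings, toy_labels, k=1)
```

In a prompt-building step, the transcripts at the returned indices would be placed before the test transcript as labeled demonstrations. A real pipeline would replace the toy vectors with embeddings from whichever encoder is used and might prefer cosine distance.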