Cross-Enhanced Multimodal Fusion of Eye-Tracking and Facial Features for Alzheimer's Disease Diagnosis

📅 2025-10-25
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
To address the challenge of early, accurate diagnosis of Alzheimer’s disease (AD), this paper proposes, for the first time, a cross-modal collaborative enhancement framework integrating eye-tracking trajectories and facial micro-expressions. Methodologically, we design a Cross-Modal Fusion Attention Module (CEFAM) and a Direction-Aware Convolutional Module (DACM), which explicitly model inter-modal interactions and modality-specific contributions via cross-attention mechanisms, global feature enhancement, and horizontal-vertical receptive field modeling—enabling adaptive discriminative representation learning. Evaluated on a self-collected multimodal dataset acquired under a visual memory search paradigm, our framework achieves 95.11% classification accuracy, significantly outperforming conventional fusion approaches. This work establishes a novel, non-invasive, objective, and scalable paradigm for AD auxiliary diagnosis and provides key technical foundations for clinical translation.

Technology Category

Application Category

📝 Abstract
Accurate diagnosis of Alzheimer's disease (AD) is essential for enabling timely intervention and slowing disease progression. Multimodal diagnostic approaches offer considerable promise by integrating complementary information across behavioral and perceptual domains. Eye-tracking and facial features, in particular, are important indicators of cognitive function, reflecting attentional distribution and neurocognitive state. However, few studies have explored their joint integration for auxiliary AD diagnosis. In this study, we propose a multimodal cross-enhanced fusion framework that synergistically leverages eye-tracking and facial features for AD detection. The framework incorporates two key modules: (a) a Cross-Enhanced Fusion Attention Module (CEFAM), which models inter-modal interactions through cross-attention and global enhancement, and (b) a Direction-Aware Convolution Module (DACM), which captures fine-grained directional facial features via horizontal-vertical receptive fields. Together, these modules enable adaptive and discriminative multimodal representation learning. To support this work, we constructed a synchronized multimodal dataset, including 25 patients with AD and 25 healthy controls (HC), by recording aligned facial video and eye-tracking sequences during a visual memory-search paradigm, providing an ecologically valid resource for evaluating integration strategies. Extensive experiments on this dataset demonstrate that our framework outperforms traditional late fusion and feature concatenation methods, achieving a classification accuracy of 95.11% in distinguishing AD from HC, highlighting superior robustness and diagnostic performance by explicitly modeling inter-modal dependencies and modality-specific contributions.
Problem

Research questions and friction points this paper is trying to address.

Integrating eye-tracking and facial features for Alzheimer's diagnosis
Modeling inter-modal interactions through cross-attention mechanisms
Capturing directional facial features with specialized convolution modules
Innovation

Methods, ideas, or system contributions that make the work stand out.

Cross-enhanced fusion integrates eye-tracking and facial features
Direction-aware convolution captures fine-grained facial directional patterns
Cross-attention models inter-modal interactions for adaptive representation learning
Y
Yujie Nie
School of Control Science and Engineering, Shandong University, Jinan, 250061, China
J
Jianzhang Ni
Department of Psychiatry, The Chinese University of Hong Kong, Hong Kong, 999077, SAR, China
Y
Yonglong Ye
Department of Psychiatry, The Chinese University of Hong Kong, Hong Kong, 999077, SAR, China
Y
Yuan-Ting Zhang
Department of Electronic Engineering, The Chinese University of Hong Kong, Hong Kong, 999077, SAR, China
Y
Yun Kwok Wing
Department of Psychiatry, The Chinese University of Hong Kong, Hong Kong, 999077, SAR, China
X
Xiangqing Xu
Department of Neurology, Shandong University of Traditional Chinese Medicine Affiliated Hospital, Jinan, 16369, China
X
Xin Ma
School of Control Science and Engineering, Shandong University, Jinan, 250061, China
Lizhou Fan
Lizhou Fan
Vice-Chancellor Assistant Professor, The Chinese University of Hong Kong
Medical AIHealth InformaticsAI AgentsAI for SciencePsychiatry