🤖 AI Summary
This work addresses the performance bottleneck in fMRI brain network analysis caused by feature sparsity and insufficient incorporation of domain knowledge. To overcome this, the authors propose an efficient multimodal enhancement paradigm that leverages large language models (LLMs) without requiring full fine-tuning. The approach employs a three-stage pipeline: first, prompt engineering generates neuroscience-informed textual descriptions; second, lightweight instruction tuning produces semantically enriched representations, which are co-trained with a graph neural network (GNN) to achieve coarse-grained alignment; and third, task-specific adapters are fine-tuned for downstream applications. A novel multimodal alignment loss is introduced to strengthen GNN representation learning. Extensive experiments across multiple fMRI datasets demonstrate that the method significantly improves performance on brain network analysis tasks, marking the first successful integration of LLMs with brain graph data.
📝 Abstract
Graph Neural Networks (GNNs) have been widely used in diverse brain network analysis tasks based on preprocessed functional magnetic resonance imaging (fMRI) data. However, their performance is constrained by high feature sparsity and the limited domain knowledge encoded in uni-modal neurographs. Meanwhile, large language models (LLMs) have demonstrated powerful representation capabilities, making the combination of LLMs with GNNs a promising direction for brain network analysis. While LLMs and multimodal LLMs (MLLMs) have begun to appear in neuroscience, the integration of LLMs with graph-based data remains unexplored. In this work, we address these issues by leveraging the powerful representation and generalization capabilities of LLMs. Given the high cost of directly tuning LLMs, we instead employ the LLM as an enhancer to boost GNN performance on downstream tasks. Our method, BLEG, comprises three stages. First, we prompt an LLM to generate augmented text for the fMRI graph data. Second, we design an LLM-LM instruction tuning method to obtain enhanced textual representations at relatively low cost, during which the GNN is co-trained for coarse-grained alignment. Finally, we fine-tune an adapter after the GNN for the given downstream tasks. An alignment loss between LM and GNN logits is designed to further strengthen the GNN's representations. Extensive experiments on different datasets confirm BLEG's superiority.
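The alignment loss between LM and GNN logits described above can be illustrated with a generic distribution-matching objective. The sketch below uses a KL divergence in NumPy; this is a hypothetical stand-in, since the abstract does not specify BLEG's exact formulation, and the names `softmax` and `alignment_loss` are assumptions.

```python
import numpy as np

def softmax(z):
    """Row-wise softmax with a max-shift for numerical stability."""
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def alignment_loss(gnn_logits, lm_logits):
    """Mean KL divergence KL(p_LM || p_GNN) over the batch.

    A generic choice for aligning the GNN's predictive distribution
    with the LM's; BLEG's actual loss may differ.
    """
    p = softmax(lm_logits)   # LM distribution (enhancer signal)
    q = softmax(gnn_logits)  # GNN distribution to be aligned
    eps = 1e-12              # guard against log(0)
    return float(np.mean(np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=-1)))
```

The loss is zero when the two logit sets induce identical distributions and grows as they diverge, so minimizing it pulls the GNN's predictions toward the LM-enhanced ones.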