🤖 AI Summary
This study addresses the challenges of low data efficiency and limited cross-task generalization in magnetoencephalography (MEG)-based speech brain–computer interfaces by proposing a Conformer-based transfer learning framework. The model is pretrained on large-scale auditory data and fine-tuned with only five minutes of individual-specific data, enabling effective transfer between speech perception and production tasks. This work provides the first empirical evidence of shared neural representations between the two task types, challenging the conventional assumption that decoding must rely on task-specific motor signals. Experimental results demonstrate improvements of 1–4% in within-task decoding accuracy and 5–6% in cross-task scenarios. Notably, models trained on speech production data decode passive auditory perception significantly above chance, highlighting the framework's robust generalization across distinct cognitive tasks.
📝 Abstract
Data-efficient neural decoding is a central challenge for speech brain–computer interfaces. We present the first demonstration of transfer learning and cross-task decoding for MEG-based speech models spanning perception and production. We pre-train a Conformer-based model on 50 hours of single-subject listening data and fine-tune on just 5 minutes of data per subject across 18 participants. Transfer learning yields consistent improvements, with in-task accuracy gains of 1–4% and larger cross-task gains of 5–6%. Not only does pre-training improve performance within each task, but it also enables reliable cross-task decoding between perception and production. Critically, models trained on speech production decode passive listening above chance, confirming that the learned representations reflect shared neural processes rather than task-specific motor activity.
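To make the recipe concrete, the following is a minimal PyTorch sketch of the pretrain-then-fine-tune setup, using torchaudio's off-the-shelf `Conformer` as a stand-in encoder. The channel count, class targets, optimizers, learning rates, and data loaders are illustrative assumptions, not the paper's actual implementation.

```python
# Minimal sketch of the pretrain-then-fine-tune recipe described above.
# All specifics (208 MEG channels, 39 classes, loaders, learning rates)
# are hypothetical placeholders, not the paper's implementation.
import torch
import torch.nn as nn
from torchaudio.models import Conformer


class SpeechDecoder(nn.Module):
    def __init__(self, n_channels: int = 208, n_classes: int = 39):
        super().__init__()
        self.encoder = Conformer(
            input_dim=n_channels,          # one feature per MEG sensor
            num_heads=4,
            ffn_dim=256,
            num_layers=6,
            depthwise_conv_kernel_size=31,
        )
        # Conformer preserves the feature dimension, so the head maps
        # n_channels -> n_classes (e.g. word or phoneme targets).
        self.head = nn.Linear(n_channels, n_classes)

    def forward(self, x: torch.Tensor, lengths: torch.Tensor) -> torch.Tensor:
        # x: (batch, time, n_channels) MEG time series
        h, _ = self.encoder(x, lengths)
        return self.head(h.mean(dim=1))    # mean-pool over time, then classify


def run_epoch(model, loader, optimizer, device="cpu"):
    """One pass over (x, lengths, y) batches with cross-entropy loss."""
    loss_fn = nn.CrossEntropyLoss()
    for x, lengths, y in loader:
        optimizer.zero_grad()
        loss = loss_fn(model(x.to(device), lengths.to(device)), y.to(device))
        loss.backward()
        optimizer.step()


model = SpeechDecoder()

# 1) Pre-train on the large single-subject listening corpus (~50 h).
pretrain_opt = torch.optim.AdamW(model.parameters(), lr=1e-4)
# run_epoch(model, listening_loader, pretrain_opt)   # hypothetical loader

# 2) Fine-tune the same weights on ~5 minutes of a new subject's data
#    (perception or production) at a reduced learning rate.
finetune_opt = torch.optim.AdamW(model.parameters(), lr=1e-5)
# run_epoch(model, subject_loader, finetune_opt)     # hypothetical loader
```

The point of the sketch is that fine-tuning reuses the pretrained weights end-to-end at a lower learning rate, so five minutes of subject-specific data only has to adapt, rather than relearn, the shared representation; this is one common way to realize the transfer described above, not necessarily the authors' exact procedure.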