🤖 AI Summary
This work addresses two foundational tasks in magnetoencephalography (MEG) signal decoding, speech detection and phoneme classification, using Conformer-based decoders developed for the LibriBrain 2025 PNPL competition. A compact Conformer is adapted to raw 306-channel MEG data through a lightweight convolutional projection layer and task-specific decoding heads. For speech detection, the authors explore an MEG-oriented SpecAugment strategy; for phoneme classification, they use inverse-square-root class weighting and a dynamic grouping loader to handle examples averaged over 100 samples. Instance-level normalization proved critical for mitigating distribution shift on the holdout split. With the official Standard track splits and F1-macro model selection, the best systems reach leaderboard scores of 88.9% for speech detection and 65.8% for phoneme classification, surpassing the competition baselines and ranking within the top 10 in both tasks.
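As a rough illustration of the inverse-square-root class weighting mentioned above, the sketch below weights each class by 1/sqrt(count). The function name `inverse_sqrt_class_weights`, the toy label counts, and the rescaling to a mean weight of 1 are assumptions made for this example and are not taken from the authors' code.

```python
import numpy as np


def inverse_sqrt_class_weights(labels, num_classes):
    """Return one weight per class, proportional to 1 / sqrt(class count).

    Hypothetical helper: the rescaling to mean 1 over the observed classes
    is an assumption, not something stated in the source.
    """
    counts = np.bincount(labels, minlength=num_classes).astype(float)
    weights = np.zeros(num_classes)
    seen = counts > 0  # classes absent from the data keep weight 0
    weights[seen] = 1.0 / np.sqrt(counts[seen])
    weights[seen] /= weights[seen].mean()
    return weights


# Toy, made-up label distribution: 100, 25 and 4 examples per class.
labels = np.array([0] * 100 + [1] * 25 + [2] * 4)
w = inverse_sqrt_class_weights(labels, num_classes=3)
```

Compared with plain inverse-frequency weighting, the square root softens the correction: here the rarest class is 25 times less frequent than the most common one but receives only 5 times its weight.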
📝 Abstract
We present Conformer-based decoders for the LibriBrain 2025 PNPL competition, targeting two foundational MEG tasks: Speech Detection and Phoneme Classification. Our approach adapts a compact Conformer to raw 306-channel MEG signals, with a lightweight convolutional projection layer and task-specific heads. For Speech Detection, an MEG-oriented SpecAugment provided a first exploration of MEG-specific augmentation. For Phoneme Classification, we used inverse-square-root class weighting and a dynamic grouping loader to handle 100-sample averaged examples. In addition, a simple instance-level normalization proved critical to mitigate distribution shifts on the holdout split. Using the official Standard track splits and F1-macro for model selection, our best systems achieved 88.9% (Speech) and 65.8% (Phoneme) on the leaderboard, surpassing the competition baselines and ranking within the top 10 in both tasks. For further implementation details, the technical documentation, source code, and checkpoints are available at https://github.com/neural2speech/libribrain-experiments.
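The grouping and normalization steps described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the repository's implementation: the helper names `grouped_average` and `instance_normalize`, the per-channel z-scoring over time, the dropping of leftover samples, and the toy array sizes (20 time points, 400 samples) are all hypothetical. Only the 306 channels and the group size of 100 come from the text.

```python
import numpy as np


def grouped_average(samples, labels, group_size, rng):
    """Average random groups of same-label samples into single examples.

    samples: (n, channels, time); labels: (n,).
    Calling this again with the same rng draws new groups, which is one way
    to make the grouping "dynamic". Leftover samples that do not fill a
    group are dropped (an assumption of this sketch).
    """
    out_x, out_y = [], []
    for label in np.unique(labels):
        idx = rng.permutation(np.flatnonzero(labels == label))
        for g in range(len(idx) // group_size):
            members = idx[g * group_size:(g + 1) * group_size]
            out_x.append(samples[members].mean(axis=0))
            out_y.append(label)
    return np.stack(out_x), np.array(out_y)


def instance_normalize(x, eps=1e-6):
    """Z-score each channel of one example over time; x: (channels, time)."""
    mean = x.mean(axis=-1, keepdims=True)
    std = x.std(axis=-1, keepdims=True)
    return (x - mean) / (std + eps)


rng = np.random.default_rng(0)
# Synthetic stand-in for MEG windows: 400 samples, 306 channels, 20 time points.
samples = rng.normal(loc=5.0, scale=3.0, size=(400, 306, 20))
labels = np.repeat([0, 1], [250, 150])

averaged, averaged_labels = grouped_average(samples, labels, group_size=100, rng=rng)
normalized = instance_normalize(averaged[0])
```

Because the statistics are computed from each example alone, this kind of normalization does not depend on training-set statistics, which is one plausible reason it would help when the holdout split is distributed differently.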