AI Summary
This study addresses a critical gap in research on the neural mechanisms of speech production and speech-based brain–computer interfaces by presenting a large-scale, multimodal dataset comprising over 1,000 hours of synchronized high-density EEG (62–128 channels), facial EMG, and high-fidelity audio recordings from three native Japanese speakers during natural speech. The data were collected across multiple sessions over extended periods using diverse recording setups and include time-aligned transcripts and event annotations. Adhering to the BIDS standard, the dataset has been publicly released via OpenNeuro under a CC0 license. Validation analyses confirm expected neurophysiological signatures, including canonical 1/f spectral profiles, task-related alpha suppression, and time-locked evoked responses. This resource constitutes the first open, multimodal speech dataset at the thousand-hour scale, enabling cross-device, longitudinal, and open-vocabulary speech decoding research.
Abstract
We present a multimodal dataset of 1020 hours of simultaneously recorded scalp electroencephalography (EEG), facial electromyography (EMG), and speech audio from three healthy native Japanese speakers during open-vocabulary overt speech. Recordings were acquired with three EEG systems spanning 62–128 channels: an ultra-high-density system (g.Pangolin) and two cap-type systems (g.SCARABEO and eegosports). Data were collected across many sessions over several months. Each session provides time-synchronized EEG, facial EMG, and audio, together with speech-event annotations and transcriptions. Although collected with speech decoding as a primary motivation, the dataset also supports work on multimodal signal processing, artifact modeling, longitudinal and cross-device adaptation, and EEG representation learning. Technical validation included power spectral density and event-related potential analyses across participants, devices, and tasks, which showed the expected 1/f spectral profile, task-related alpha-band attenuation, and time-locked evoked responses. The dataset is released in Brain Imaging Data Structure (BIDS) format via OpenNeuro under a CC0 waiver to support both speech-related and broader EEG research.
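As a rough illustration of the spectral checks mentioned in the abstract, the sketch below estimates a 1/f slope and alpha-band attenuation on synthetic signals. It is not the authors' validation pipeline: the helper names (`welch_psd`, `band_power`, `aperiodic_slope`, `synth_eeg`), the 250 Hz sampling rate, the 8-13 Hz alpha band, and the synthetic "rest" and "task" signals are all assumptions made for this example, and no dataset files are read.

```python
import numpy as np


def welch_psd(x, fs, nperseg=1024):
    """Average Hann-windowed periodograms over 50%-overlapping segments."""
    step = nperseg // 2
    window = np.hanning(nperseg)
    scale = fs * np.sum(window ** 2)
    starts = range(0, len(x) - nperseg + 1, step)
    segs = np.array([x[s:s + nperseg] for s in starts])
    segs = segs - segs.mean(axis=1, keepdims=True)  # remove per-segment mean
    spec = np.abs(np.fft.rfft(segs * window, axis=1)) ** 2 / scale
    spec[:, 1:-1] *= 2  # fold negative frequencies into a one-sided PSD
    freqs = np.fft.rfftfreq(nperseg, d=1.0 / fs)
    return freqs, spec.mean(axis=0)


def band_power(freqs, psd, lo, hi):
    """Mean PSD inside the band [lo, hi] in Hz."""
    mask = (freqs >= lo) & (freqs <= hi)
    return psd[mask].mean()


def aperiodic_slope(freqs, psd, lo=2.0, hi=40.0, exclude=(7.0, 14.0)):
    """Slope of log10(PSD) vs log10(f), skipping the alpha peak region."""
    in_range = (freqs >= lo) & (freqs <= hi)
    in_peak = (freqs >= exclude[0]) & (freqs <= exclude[1])
    mask = in_range & ~in_peak
    slope, _ = np.polyfit(np.log10(freqs[mask]), np.log10(psd[mask]), 1)
    return slope


def synth_eeg(alpha_amp, fs=250, seconds=60, seed=0):
    """Synthetic signal: 1/f ("pink") noise plus a 10 Hz sinusoid."""
    rng = np.random.default_rng(seed)
    n = fs * seconds
    spec = np.fft.rfft(rng.standard_normal(n))
    f = np.fft.rfftfreq(n, d=1.0 / fs)
    f[0] = f[1]  # avoid division by zero at DC
    pink = np.fft.irfft(spec / np.sqrt(f), n=n)  # power proportional to 1/f
    t = np.arange(n) / fs
    return pink + alpha_amp * np.sin(2 * np.pi * 10 * t)


fs = 250
rest = synth_eeg(alpha_amp=1.0, fs=fs)  # strong alpha (assumed "rest")
task = synth_eeg(alpha_amp=0.3, fs=fs)  # weaker alpha (assumed "task")

freqs, psd_rest = welch_psd(rest, fs)
_, psd_task = welch_psd(task, fs)

slope = aperiodic_slope(freqs, psd_rest)
alpha_rest = band_power(freqs, psd_rest, 8.0, 13.0)
alpha_task = band_power(freqs, psd_task, 8.0, 13.0)
attenuation_db = 10 * np.log10(alpha_rest / alpha_task)

print(f"aperiodic slope: {slope:.2f}")
print(f"alpha attenuation (rest vs task): {attenuation_db:.1f} dB")
```

On real recordings the same two quantities would be computed per channel and per condition. A negative log-log slope and a positive rest-versus-task alpha ratio are the qualitative signatures the abstract describes.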