A 1000-hour EEG-EMG-audio dataset of Japanese speech production

2026-05-31
AI Summary
This study addresses a critical gap in research on the neural mechanisms of speech production and speech-based brain–computer interfaces by presenting a large-scale, multimodal dataset comprising over 1,000 hours of synchronized high-density EEG (62–128 channels), facial EMG, and high-fidelity audio recordings from three native Japanese speakers during natural speech. The data were collected across multiple sessions over extended periods using diverse recording setups and include time-aligned transcripts and event annotations. Adhering to the BIDS standard, the dataset has been publicly released via OpenNeuro under a CC0 license. Validation analyses confirm expected neurophysiological signatures, including canonical 1/f spectral profiles, task-related alpha suppression, and time-locked evoked responses. This resource constitutes the first open, multimodal speech dataset at the thousand-hour scale, enabling cross-device, longitudinal, and open-vocabulary speech decoding research.
Abstract
We present a multimodal dataset of 1020 hours of simultaneously recorded scalp electroencephalography (EEG), facial electromyography (EMG), and speech audio from three healthy native Japanese speakers during open-vocabulary overt speech. Recordings were acquired with three EEG systems spanning 62-128 channels: an ultra-high-density system (g.Pangolin) and two cap-type systems (g.SCARABEO and eegosports). Data were collected across many sessions over several months. Each session provides time-synchronized EEG, facial EMG, and audio, together with speech-event annotations and transcriptions. Although collected with speech decoding as a primary motivation, the dataset also supports work on multimodal signal processing, artifact modeling, longitudinal and cross-device adaptation, and EEG representation learning. Technical validation included power spectral density and event-related potential analyses across participants, devices, and tasks, which showed the expected 1/f spectral profile, task-related alpha-band attenuation, and time-locked evoked responses. The dataset is released in Brain Imaging Data Structure (BIDS) format via OpenNeuro under a CC0 waiver to support both speech-related and broader EEG research.
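To make the spectral part of the technical validation concrete, the sketch below shows one way a 1/f slope and an alpha-band power ratio might be estimated from a power spectral density. It is not the authors' pipeline: it runs on synthetic signals, uses NumPy only, and every name and parameter in it (`welch_psd`, `aperiodic_slope`, `band_power`, `synth_eeg`, the 250 Hz sampling rate, the 8-13 Hz alpha band, the 2-40 Hz fit range) is an assumption chosen for illustration.

```python
import numpy as np

def welch_psd(x, fs, nperseg=1024):
    """Welch-style PSD: average Hann-windowed periodograms, 50% overlap."""
    step = nperseg // 2
    window = np.hanning(nperseg)
    scale = fs * np.sum(window ** 2)
    starts = range(0, len(x) - nperseg + 1, step)
    segs = np.array([x[s:s + nperseg] for s in starts])
    segs = segs - segs.mean(axis=1, keepdims=True)  # remove per-segment mean
    spec = np.abs(np.fft.rfft(segs * window, axis=1)) ** 2 / scale
    spec[:, 1:-1] *= 2.0  # one-sided density (DC and Nyquist not doubled)
    freqs = np.fft.rfftfreq(nperseg, d=1.0 / fs)
    return freqs, spec.mean(axis=0)

def aperiodic_slope(freqs, psd, fmin=2.0, fmax=40.0, exclude=(7.0, 14.0)):
    """Slope of log10(power) against log10(frequency), skipping the alpha peak."""
    mask = (freqs >= fmin) & (freqs <= fmax)
    mask &= ~((freqs >= exclude[0]) & (freqs <= exclude[1]))
    slope, _intercept = np.polyfit(np.log10(freqs[mask]), np.log10(psd[mask]), 1)
    return slope

def band_power(freqs, psd, lo=8.0, hi=13.0):
    """Integrate the PSD over a frequency band (rectangle rule)."""
    mask = (freqs >= lo) & (freqs <= hi)
    return np.sum(psd[mask]) * (freqs[1] - freqs[0])

def synth_eeg(n, fs, alpha_amp, rng):
    """Hypothetical stand-in for one EEG channel: 1/f noise plus a 10 Hz rhythm."""
    spec = np.fft.rfft(rng.standard_normal(n))
    f = np.fft.rfftfreq(n, d=1.0 / fs)
    f[0] = f[1]  # avoid dividing by zero at DC
    pink = np.fft.irfft(spec / np.sqrt(f), n=n)  # power proportional to 1/f
    pink /= pink.std()
    t = np.arange(n) / fs
    return pink + alpha_amp * np.sin(2 * np.pi * 10.0 * t)

fs = 250.0  # assumed sampling rate, not taken from the dataset
rng = np.random.default_rng(0)
rest = synth_eeg(int(60 * fs), fs, alpha_amp=1.0, rng=rng)  # strong alpha
task = synth_eeg(int(60 * fs), fs, alpha_amp=0.3, rng=rng)  # attenuated alpha

freqs, psd_rest = welch_psd(rest, fs)
_, psd_task = welch_psd(task, fs)

slope = aperiodic_slope(freqs, psd_rest)
alpha_ratio = band_power(freqs, psd_task) / band_power(freqs, psd_rest)
print(f"aperiodic slope: {slope:.2f}")
print(f"task/rest alpha power ratio: {alpha_ratio:.2f}")
```

A slope near -1 on log-log axes corresponds to the 1/f profile, and a task/rest ratio below 1 corresponds to alpha-band attenuation. On real recordings the fit range, the excluded band, and the choice of rest and task segments would all need to follow the dataset's own annotations.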
Problem

Research questions and friction points this paper is trying to address.

EEG
EMG
speech production
multimodal dataset
Japanese
Innovation

Methods, ideas, or system contributions that make the work stand out.

multimodal dataset
EEG-EMG-audio synchronization
speech decoding
cross-device EEG
BIDS format
Authors
Motoshige Sato
Araya Inc., Tokyo, Japan
Ilya Horiguchi
Araya Inc., Tokyo, Japan
Masakazu Inoue
Araya Inc., Tokyo, Japan
Kenichi Tomeoka
Araya Inc., Tokyo, Japan
Eri Hatakeyama
Araya Inc., Tokyo, Japan
Yuya Kita
Araya Inc., Tokyo, Japan
Atsushi Yamamoto
Araya Inc., Tokyo, Japan
Ippei Fujisawa
Araya Inc.
Artificial Intelligence, Deep Learning, Computer Vision, Physics
Shuntaro Sasai
Araya Inc.
neuroscience