Brain-Gen: Towards Interpreting Neural Signals for Stimulus Reconstruction Using Transformers and Latent Diffusion Models

📅 2025-12-21

🤖 AI Summary

Reconstructing visual stimuli from EEG remains challenging because the signals are noisy, spatially diffuse, and temporally variable. To address this, we propose a cross-modal EEG-to-image decoding framework in which a Transformer extracts spatiotemporal representations from EEG recordings, and these representations are incorporated into the attention mechanisms of a latent diffusion model (LDM) to guide image reconstruction. The approach targets more interpretable semantic decoding and better generalization to unseen classes, aiming to reduce the reliance on fixed stimulus sets that limits prior methods. On public benchmark datasets, it achieves up to a 6.5% increase in latent-space clustering accuracy and an 11.8% increase in zero-shot generalization to unseen classes, with Inception Score and Fréchet Inception Distance (FID) comparable to existing baselines.

📝 Abstract

Advances in neuroscience and artificial intelligence have enabled preliminary decoding of brain activity, yet the interpretability of neural representations remains limited. A significant challenge arises from the intrinsic properties of electroencephalography (EEG) signals, including high noise levels, spatial diffusion, and pronounced temporal variability. To interpret the neural mechanisms underlying thought, we propose a Transformer-based framework that extracts spatiotemporal representations associated with observed visual stimuli from EEG recordings. These features are then incorporated into the attention mechanisms of Latent Diffusion Models (LDMs) to reconstruct visual stimuli from brain activity. Quantitative evaluations on publicly available benchmark datasets show that the proposed method models the semantic structure of EEG signals well, achieving up to a 6.5% increase in latent-space clustering accuracy and an 11.8% increase in zero-shot generalization to unseen classes, while attaining Inception Score and Fréchet Inception Distance comparable to existing baselines. Our work marks a significant step towards generalizable semantic interpretation of EEG signals.
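
For readers unfamiliar with the two image-quality metrics named in the abstract, their standard definitions are given below. These are the commonly used forms, not formulas taken from the paper; the symbols are introduced here for illustration: $\mu_r, \Sigma_r$ and $\mu_g, \Sigma_g$ are the mean and covariance of Inception features for real and generated images, $p(y \mid x)$ is the Inception classifier's label distribution for image $x$, and $p(y)$ is its marginal over generated images.

```latex
% Fréchet Inception Distance (lower is better)
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2} \right)

% Inception Score (higher is better)
\mathrm{IS} = \exp\!\left( \mathbb{E}_{x \sim p_g}\,
  D_{\mathrm{KL}}\!\left( p(y \mid x) \,\Vert\, p(y) \right) \right)
```

The exact evaluation protocol (feature layer, sample counts, splits) is not stated in this summary.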

Problem

Research questions and friction points this paper is trying to address.

Reconstruct visual stimuli from noisy EEG signals

Interpret neural mechanisms underlying thoughts via EEG

Enhance semantic modeling and generalization from brain activity

Innovation

Methods, ideas, or system contributions that make the work stand out.

Transformers extract spatiotemporal EEG features

Latent Diffusion Models reconstruct visual stimuli

Attention mechanisms integrate neural representations
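
The points above can be illustrated with a minimal sketch of cross-attention conditioning, where image latents act as queries and EEG-derived tokens supply keys and values. This is not the authors' implementation: the function name `cross_attention`, the projection matrices, and all dimensions are hypothetical, and random arrays stand in for the Transformer's EEG features and the diffusion model's latents.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row-wise max for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(latents, eeg_tokens, w_q, w_k, w_v):
    """Single-head cross-attention (illustrative, hypothetical helper).

    latents:    (n_latent, d_latent) image-latent positions, used as queries
    eeg_tokens: (n_eeg, d_eeg) EEG features, used as keys and values
    Returns the attended output and the attention weights.
    """
    q = latents @ w_q                 # (n_latent, d_attn)
    k = eeg_tokens @ w_k              # (n_eeg, d_attn)
    v = eeg_tokens @ w_v              # (n_eeg, d_attn)
    scores = q @ k.T / np.sqrt(q.shape[-1])   # scaled dot-product scores
    weights = softmax(scores, axis=-1)        # each latent attends over EEG tokens
    return weights @ v, weights

# Assumed toy sizes; real models use far larger dimensions.
rng = np.random.default_rng(0)
n_latent, d_latent = 16, 8    # e.g. a flattened 4x4 latent grid
n_eeg, d_eeg = 4, 12          # e.g. 4 EEG tokens from a Transformer encoder
d_attn = 8

latents = rng.normal(size=(n_latent, d_latent))
eeg_tokens = rng.normal(size=(n_eeg, d_eeg))
w_q = rng.normal(size=(d_latent, d_attn))
w_k = rng.normal(size=(d_eeg, d_attn))
w_v = rng.normal(size=(d_eeg, d_attn))

out, weights = cross_attention(latents, eeg_tokens, w_q, w_k, w_v)
print(out.shape, weights.shape)   # (16, 8) (16, 4)
```

Each row of `weights` sums to 1 and shows how strongly one latent position draws on each EEG token, which is the kind of attention map that could be inspected for interpretability.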
