Mind-to-Face: Neural-Driven Photorealistic Avatar Synthesis via EEG Decoding

📅 2025-12-03
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
Existing avatar systems heavily rely on visual cues, failing under facial occlusion or when emotions are implicit. This work proposes the first non-invasive EEG-driven personalized virtual avatar system, enabling end-to-end generation of high-fidelity, dynamic 3D facial expressions directly from neural signals. Methodologically, we design a CNN-Transformer encoder to extract affective and fine-grained facial motion features from EEG, mapping them to dense 3D positional maps; these are rendered via an enhanced 3D Gaussian Splatting pipeline to ensure photorealism and multi-view consistency. Experiments demonstrate that the system accurately reconstructs subject-specific, subtle emotional expressions solely from EEG, maintaining stable, high-fidelity dynamic reconstruction even without visual input or under facial occlusion. This study is the first to empirically verify that EEG encodes sufficient geometric and affective dynamic information for facial animation, thereby establishing a novel paradigm for neural-driven avatars.
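To make the encoder design concrete, below is a minimal PyTorch sketch of a CNN-Transformer that maps a windowed EEG segment to a dense 3D position map. All layer sizes, channel counts, and the pooling/upsampling head are illustrative assumptions; the paper's actual architecture is not reproduced here.

```python
# Hypothetical sketch of a CNN-Transformer EEG encoder (PyTorch).
# Every hyperparameter below is an illustrative assumption, not the paper's config.
import torch
import torch.nn as nn

class EEGToPositionMap(nn.Module):
    """Maps a windowed EEG segment to a dense 3D position map (3, H, W)."""

    def __init__(self, n_channels=64, d_model=256, n_heads=8, n_layers=4, map_size=64):
        super().__init__()
        self.map_size = map_size
        # 1D CNN front end: local temporal filtering of the raw EEG channels.
        self.cnn = nn.Sequential(
            nn.Conv1d(n_channels, 128, kernel_size=7, stride=2, padding=3),
            nn.GELU(),
            nn.Conv1d(128, d_model, kernel_size=5, stride=2, padding=2),
            nn.GELU(),
        )
        # Transformer encoder: long-range temporal dependencies across tokens.
        layer = nn.TransformerEncoderLayer(
            d_model, n_heads, dim_feedforward=4 * d_model, batch_first=True)
        self.transformer = nn.TransformerEncoder(layer, num_layers=n_layers)
        # Decode the pooled token into a coarse position map, then upsample.
        self.head = nn.Linear(d_model, 3 * (map_size // 4) ** 2)
        self.upsample = nn.Sequential(
            nn.Upsample(scale_factor=4, mode="bilinear", align_corners=False),
            nn.Conv2d(3, 3, kernel_size=3, padding=1),
        )

    def forward(self, eeg):             # eeg: (B, n_channels, T)
        tokens = self.cnn(eeg)          # (B, d_model, T')
        tokens = tokens.transpose(1, 2) # (B, T', d_model)
        tokens = self.transformer(tokens)
        pooled = tokens.mean(dim=1)     # temporal average pooling
        coarse = self.head(pooled).view(-1, 3, self.map_size // 4, self.map_size // 4)
        return self.upsample(coarse)    # (B, 3, map_size, map_size), xyz per texel

model = EEGToPositionMap()
pos_map = model(torch.randn(2, 64, 512))  # 2 windows, 64 channels, 512 samples
print(pos_map.shape)                      # torch.Size([2, 3, 64, 64])
```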

📝 Abstract
Current expressive avatar systems rely heavily on visual cues, failing when faces are occluded or when emotions remain internal. We present Mind-to-Face, the first framework that decodes non-invasive electroencephalogram (EEG) signals directly into high-fidelity facial expressions. We build a dual-modality recording setup to obtain synchronized EEG and multi-view facial video during emotion-eliciting stimuli, enabling precise supervision for neural-to-visual learning. Our model uses a CNN-Transformer encoder to map EEG signals into dense 3D position maps capable of sampling over 65k vertices, capturing fine-scale geometry and subtle emotional dynamics, and renders them through a modified 3D Gaussian Splatting pipeline for photorealistic, view-consistent results. Through extensive evaluation, we show that EEG alone can reliably predict dynamic, subject-specific facial expressions, including subtle emotional responses, demonstrating that neural signals contain far richer affective and geometric information than previously assumed. Mind-to-Face establishes a new paradigm for neural-driven avatars, enabling personalized, emotion-aware telepresence and cognitive interaction in immersive environments.
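The abstract's "dense 3D position maps capable of sampling over 65k vertices" suggests a UV-space map from which mesh vertices are read out by interpolation. Below is a hedged sketch of that readout step, assuming each texel stores an xyz coordinate; the 256×256 resolution and random UVs are illustrative assumptions (256×256 = 65,536 texels, consistent with the "over 65k" figure).

```python
# Minimal sketch: sampling a vertex set from a dense 3D position map.
# The map resolution and UV coordinates here are assumptions for illustration.
import torch
import torch.nn.functional as F

def sample_vertices(pos_map, uv):
    """pos_map: (B, 3, H, W) xyz position map; uv: (B, N, 2) in [0, 1].

    Returns (B, N, 3) vertex positions, bilinearly interpolated in UV space.
    """
    grid = uv * 2.0 - 1.0                     # grid_sample expects [-1, 1]
    grid = grid.unsqueeze(2)                  # (B, N, 1, 2)
    verts = F.grid_sample(pos_map, grid, mode="bilinear", align_corners=False)
    return verts.squeeze(-1).transpose(1, 2)  # (B, N, 3)

pos_map = torch.randn(1, 3, 256, 256)         # 256 * 256 = 65,536 texels
uv = torch.rand(1, 65536, 2)                  # one UV sample per mesh vertex
verts = sample_vertices(pos_map, uv)
print(verts.shape)                            # torch.Size([1, 65536, 3])
```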
Problem

Research questions and friction points this paper is trying to address.

Can high-fidelity facial expressions be decoded from non-invasive EEG signals alone?
Do neural signals carry enough geometric and affective information to capture subtle emotional dynamics?
How can avatars remain expressive for emotion-aware telepresence when faces are occluded or emotions stay internal?
Innovation

Methods, ideas, or system contributions that make the work stand out.

First framework to decode non-invasive EEG signals into high-fidelity facial expressions
CNN-Transformer encoder maps EEG to dense 3D position maps
Modified 3D Gaussian Splatting renders photorealistic, view-consistent avatars
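As a concrete reading of the last bullet, the sketch below shows one plausible hand-off from a predicted position map to a 3D Gaussian Splatting scene: each texel's xyz becomes a Gaussian center with default scale, rotation, opacity, and color. The parameter set follows common open-source 3DGS implementations; the paper's specific modifications to the splatting pipeline are not public here and are not shown.

```python
# Hedged sketch: seeding a 3D Gaussian Splatting scene from a predicted
# position map. Parameter names follow common 3DGS codebases; the actual
# "modified" pipeline in the paper is assumed, not reproduced.
import torch

def init_gaussians_from_position_map(pos_map):
    """pos_map: (3, H, W) xyz map from the EEG encoder -> dict of splat params."""
    xyz = pos_map.reshape(3, -1).T                 # (H*W, 3) Gaussian centers
    n = xyz.shape[0]
    return {
        "means": xyz.clone(),                      # one Gaussian per texel
        "scales": torch.full((n, 3), -4.0),        # log-scale, small isotropic
        "rotations": torch.cat(                    # identity quaternions (w, x, y, z)
            [torch.ones(n, 1), torch.zeros(n, 3)], dim=1),
        "opacities": torch.zeros(n, 1),            # logit 0 -> 0.5 after sigmoid
        "colors": torch.full((n, 3), 0.5),         # neutral gray RGB init
    }

splats = init_gaussians_from_position_map(torch.randn(3, 64, 64))
print({k: tuple(v.shape) for k, v in splats.items()})
# {'means': (4096, 3), 'scales': (4096, 3), 'rotations': (4096, 4), ...}
```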
Haolin Xiong
Institute for Creative Technologies, University of Southern California
Tianwen Fu
Institute for Creative Technologies, University of Southern California
Pratusha Bhuvana Prasad
Institute for Creative Technologies, University of Southern California
Yunxuan Cai
Institute for Creative Technologies, University of Southern California
Haiwei Chen
University of Southern California
Computer Vision
Wenbin Teng
University of Southern California
Computer Vision · Generative Model · 3D Reconstruction
Hanyuan Xiao
Institute for Creative Technologies, University of Southern California
Yajie Zhao
Computer Scientist at University of Southern California
Virtual Human · Neural Render · AR/VR