EventEgo3D++: 3D Human Motion Capture from a Head-Mounted Event Camera

📅 2025-02-11
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
To address the failure of head-mounted monocular RGB systems under low-light conditions and high-speed motion, this paper proposes the first egocentric 3D human pose estimation framework based on a monocular fisheye event camera. Methodologically, the approach encodes event streams with the LNES representation and combines fisheye geometric modeling with SMPL-based human priors, enabling end-to-end regression from event streams to 3D poses. The contributions are threefold: (1) the first egocentric 3D human motion capture approach built on a head-mounted fisheye event camera; (2) the release of the first paired egocentric event / allocentric RGB multimodal head-mounted device (HMD) dataset, comprising both real and synthetic sequences; and (3) state-of-the-art 3D pose accuracy in challenging scenarios, with real-time inference at 140 Hz and significantly improved illumination invariance and motion robustness.

📝 Abstract
Monocular egocentric 3D human motion capture remains a significant challenge, particularly under conditions of low lighting and fast movements, which are common in head-mounted device applications. Existing methods that rely on RGB cameras often fail under these conditions. To address these limitations, we introduce EventEgo3D++, the first approach that leverages a monocular event camera with a fisheye lens for 3D human motion capture. Event cameras excel in high-speed scenarios and varying illumination due to their high temporal resolution, providing reliable cues for accurate 3D human motion capture. EventEgo3D++ leverages the LNES representation of event streams to enable precise 3D reconstructions. We have also developed a mobile head-mounted device (HMD) prototype equipped with an event camera, capturing a comprehensive dataset that includes real event observations from both controlled studio environments and in-the-wild settings, in addition to a synthetic dataset. Additionally, to provide a more holistic dataset, we include allocentric RGB streams that offer different perspectives of the HMD wearer, along with their corresponding SMPL body model. Our experiments demonstrate that EventEgo3D++ achieves superior 3D accuracy and robustness compared to existing solutions, even in challenging conditions. Moreover, our method supports real-time 3D pose updates at a rate of 140Hz. This work is an extension of the EventEgo3D approach (CVPR 2024) and further advances the state of the art in egocentric 3D human motion capture. For more details, visit the project page at https://eventego3d.mpi-inf.mpg.de.
Problem

Research questions and friction points this paper is trying to address.

RGB-based egocentric capture fails in low-light conditions.
Fast movements cause motion blur that degrades pose estimation.
HMD applications require real-time 3D pose updates.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses a monocular fisheye event camera
Adopts the LNES event representation
Supports real-time pose updates at 140 Hz
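The LNES encoding named above can be sketched in a few lines. This is a minimal illustrative implementation assuming the common formulation from the event-camera literature (each pixel stores the normalized timestamp of its most recent event, with one channel per polarity); the function and parameter names are hypothetical and not taken from the authors' code.

```python
import numpy as np

def lnes_frame(events, t_start, t_end, height, width):
    """Build an LNES frame from a time window of events.

    events: iterable of (x, y, t, p) tuples, sorted by timestamp t,
            with polarity p in {0, 1}.
    Returns a (2, height, width) float32 array where each pixel holds
    the timestamp of its latest event, normalized to [0, 1] within
    the window; pixels without events stay 0.
    """
    frame = np.zeros((2, height, width), dtype=np.float32)
    span = t_end - t_start
    for x, y, t, p in events:
        # Later events overwrite earlier ones at the same pixel/polarity,
        # so the frame keeps only the most recent normalized timestamp.
        frame[int(p), int(y), int(x)] = (t - t_start) / span
    return frame
```

Because only the latest event per pixel survives, the representation is compact and largely insensitive to the absolute event rate, which fits the high-speed, varying-illumination scenarios the paper targets.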
Christen Millerdurai
Augmented Vision, German Research Center for Artificial Intelligence (DFKI), Trippstadter Str. 122, Kaiserslautern, 67663, Rhineland-Palatinate, Germany.
Hiroyasu Akada
Max Planck Institute for Informatics
computer vision, deep learning, human pose estimation
Jian Wang
Visual Computing and Artificial Intelligence, Max Planck Institute for Informatics, SIC, Stuhlsatzenhausweg E1 4, Saarbrücken, 66123, Saarland, Germany.
D. Luvizon
Visual Computing and Artificial Intelligence, Max Planck Institute for Informatics, SIC, Stuhlsatzenhausweg E1 4, Saarbrücken, 66123, Saarland, Germany.
A. Pagani
Augmented Vision, German Research Center for Artificial Intelligence (DFKI), Trippstadter Str. 122, Kaiserslautern, 67663, Rhineland-Palatinate, Germany.
Didier Stricker
Professor of Computer Science, University of Kaiserslautern
augmented reality, computer vision, image processing, body sensor networks, HCI
C. Theobalt
Visual Computing and Artificial Intelligence, Max Planck Institute for Informatics, SIC, Stuhlsatzenhausweg E1 4, Saarbrücken, 66123, Saarland, Germany.
Vladislav Golyanik
Senior Researcher, MPI for Informatics
3D reconstruction, neural rendering, generative models, quantum computer vision