🤖 AI Summary
To address inaccurate 3D human pose estimation under real-world challenges such as occlusion, missing views, and sensor noise, this paper introduces the task of Deficiency-Aware 3D Pose Estimation. Methodologically, we replace conventional multi-stage pipelines with DeProPose, a simplified end-to-end architecture. We further propose an adaptive multi-view feature fusion mechanism guided by relative projection error, which weights each view dynamically during feature aggregation to improve robustness. To support this task, we construct DA-3DPE, a benchmark dataset explicitly designed to model deficiencies (noise interference, missing viewpoints, and occlusion) in 3D pose estimation. Experiments show that our approach improves over state-of-the-art methods on the DA-3DPE benchmark and also delivers consistent gains on standard benchmarks such as Human3.6M, supporting its robustness and generalization capability.
📝 Abstract
3D human pose estimation has wide applications in fields such as intelligent surveillance, motion capture, and virtual reality. In real-world scenarios, however, occlusion, noise interference, and missing viewpoints can severely degrade pose estimation. To address these challenges, we introduce the task of Deficiency-Aware 3D Pose Estimation. Traditional 3D pose estimation methods often rely on multi-stage networks and modular combinations, which can accumulate errors across stages and increase training complexity, leaving them poorly suited to deficiency-aware estimation. We therefore propose DeProPose, a flexible method that simplifies the network architecture to reduce training complexity and avoid the information loss of multi-stage designs. The model also introduces a multi-view feature fusion mechanism based on relative projection error, which draws on information from multiple viewpoints and assigns their weights dynamically, enabling efficient integration and greater robustness under deficient inputs. Furthermore, to thoroughly evaluate this end-to-end multi-view 3D human pose estimation model and to advance research on occlusion-related challenges, we have developed a new 3D human pose estimation dataset, termed the Deficiency-Aware 3D Pose Estimation (DA-3DPE) dataset. This dataset covers a wide range of deficiency scenarios, including noise interference, missing viewpoints, and occlusion. Compared with state-of-the-art methods, DeProPose not only performs better on the deficiency-aware problem but also improves results in conventional scenarios, providing a powerful and user-friendly solution for 3D human pose estimation. The source code will be available at https://github.com/WUJINHUAN/DeProPose.
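The abstract does not spell out how relative projection error is turned into fusion weights, so the following is only a minimal sketch of one plausible reading, not the DeProPose implementation. It assumes each view supplies a per-joint feature vector and a per-joint relative projection error, and that weights come from a softmax over the negated error, with missing views masked out. The function name `fuse_views`, the `temperature` parameter, and the array shapes are all hypothetical.

```python
import numpy as np


def fuse_views(features, rel_proj_err, valid, temperature=1.0):
    """Fuse per-view joint features with error-dependent weights (illustrative).

    features:     (V, J, C) per-view, per-joint feature vectors
    rel_proj_err: (V, J)    per-view, per-joint relative projection error
    valid:        (V,)      False for views that are missing entirely
    Returns the fused features (J, C) and the weights (V, J).
    """
    features = np.asarray(features, dtype=float)
    err = np.asarray(rel_proj_err, dtype=float)
    valid = np.asarray(valid, dtype=bool)
    if not valid.any():
        raise ValueError("at least one view must be available")

    # Assumption: lower error means a more trustworthy view, so use -err as the logit.
    logits = -err / temperature
    # Missing views get -inf, which yields a weight of exactly zero.
    logits = np.where(valid[:, None], logits, -np.inf)

    # Numerically stable softmax across the view axis.
    logits = logits - logits.max(axis=0, keepdims=True)
    weights = np.exp(logits)
    weights = weights / weights.sum(axis=0, keepdims=True)

    # Zero out features of missing views so NaN placeholders cannot propagate.
    safe_features = np.where(valid[:, None, None], features, 0.0)
    fused = (weights[..., None] * safe_features).sum(axis=0)
    return fused, weights


# Example: three views, two joints, four feature channels; the third view is missing.
feats = np.stack([np.full((2, 4), 1.0), np.full((2, 4), 2.0), np.full((2, 4), np.nan)])
errors = np.array([[0.1, 0.5], [0.5, 0.1], [0.0, 0.0]])
available = np.array([True, True, False])
fused, weights = fuse_views(feats, errors, available)
```

Under this reading, a view that is occluded or noisy for a given joint produces a larger projection error and is down-weighted for that joint only, while a missing view contributes nothing. A learned weighting network could replace the fixed softmax; the abstract leaves that choice open.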