🤖 AI Summary
Human pose estimation faces a fundamental trade-off among accuracy, efficiency, and uncertainty quantification (UQ): regression-based methods suffer from weak UQ due to restrictive distributional assumptions, while heatmap-based approaches offer flexible modeling at the cost of high computational overhead. This paper proposes Continuous Flow Residual Estimation (CFRE), the first method to embed Continuous Normalizing Flows (CNFs) into a regression framework. CFRE employs neural ordinary differential equations (neural ODEs) to dynamically model output distributions via residual flows, eliminating dependence on heatmap decoding and fixed parametric distribution assumptions. It enables differentiable uncertainty calibration and achieves a 37% reduction in Expected Calibration Error (ECE) on both 2D and 3D pose estimation benchmarks. Moreover, CFRE attains significantly improved localization accuracy while maintaining inference speed comparable to lightweight regression models, substantially outperforming heatmap-based baselines.
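The summary does not say how ECE is computed for continuous keypoint outputs. As a rough illustration only, the sketch below uses one common regression-style variant: evaluate each predicted CDF at the true value (the probability integral transform, PIT) and compare nominal levels with empirical frequencies. The function names `gaussian_pit` and `calibration_error`, the Gaussian predictive distribution, and the choice of 20 levels are assumptions made for this example, not details from the paper.

```python
import math
import numpy as np

def gaussian_pit(y, mu, sigma):
    """PIT values u = F(y) for Gaussian predictions N(mu, sigma^2) (assumed form)."""
    z = np.ravel((np.asarray(y, dtype=float) - mu) / sigma)
    # Standard normal CDF via the error function.
    return np.array([0.5 * (1.0 + math.erf(v / math.sqrt(2.0))) for v in z])

def calibration_error(pit, n_levels=20):
    """Mean absolute gap between nominal level p and the fraction of PIT values <= p."""
    levels = np.linspace(0.05, 0.95, n_levels)
    empirical = np.array([(pit <= p).mean() for p in levels])
    return float(np.mean(np.abs(empirical - levels)))

# Hypothetical usage: true residuals have unit variance, but one model
# reports sigma = 0.3 (overconfident) and another reports sigma = 1.0.
rng = np.random.default_rng(0)
y = rng.normal(size=5000)
print("well specified:", calibration_error(gaussian_pit(y, 0.0, 1.0)))
print("overconfident: ", calibration_error(gaussian_pit(y, 0.0, 0.3)))
```

A well-calibrated model produces PIT values that are close to uniform on [0, 1], so the gap is near zero; an overconfident model pushes PIT values toward 0 and 1 and the gap grows.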
📝 Abstract
Human Pose Estimation (HPE) is increasingly important for applications such as virtual reality and motion analysis, yet current methods struggle to balance accuracy, computational efficiency, and reliable uncertainty quantification (UQ). Traditional regression-based methods assume fixed output distributions, which can lead to poor UQ. Heatmap-based methods effectively model the output distribution using likelihood heatmaps; however, they demand significant computational resources. To address these limitations, we propose Continuous Flow Residual Estimation (CFRE), which integrates Continuous Normalizing Flows (CNFs) into regression-based models and thereby allows for dynamic distribution adaptation. Through extensive experiments, we show that CFRE improves both accuracy and uncertainty quantification while retaining computational efficiency on 2D and 3D human pose estimation tasks.
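To make the CNF idea concrete, here is a minimal NumPy sketch of how a continuous normalizing flow assigns a log-density to a regression residual. It is not the authors' implementation. It assumes the flow models the residual between the ground-truth and predicted keypoint, uses a standard normal base distribution, a tiny tanh network as the velocity field, and fixed-step Euler integration; the names `dynamics` and `residual_log_density` are hypothetical. The density follows the instantaneous change-of-variables formula, log p(r) = log p0(z(0)) - ∫ tr(∂f/∂z) dt.

```python
import numpy as np

def dynamics(z, params):
    """Velocity field f(z) = W2 tanh(W1 z + b1) + b2 and the exact trace of its Jacobian."""
    W1, b1, W2, b2 = params
    h = np.tanh(W1 @ z + b1)                  # hidden activations, shape (H,)
    f = W2 @ h + b2                           # velocity, shape (d,)
    # Jacobian is W2 diag(1 - h^2) W1, so its trace is sum_k (1 - h_k^2) (W1 W2)_kk.
    trace = np.sum((1.0 - h ** 2) * np.sum(W1 * W2.T, axis=1))
    return f, trace

def residual_log_density(r, params, n_steps=100):
    """log p(r) for r = z(1), where dz/dt = f(z) on [0, 1] and z(0) ~ N(0, I)."""
    z = np.asarray(r, dtype=float).copy()
    dt = 1.0 / n_steps
    trace_integral = 0.0
    for _ in range(n_steps):                  # Euler steps from t = 1 back to t = 0
        f, tr = dynamics(z, params)
        z = z - dt * f
        trace_integral += dt * tr
    d = z.shape[0]
    log_base = -0.5 * np.sum(z ** 2) - 0.5 * d * np.log(2.0 * np.pi)
    return log_base - trace_integral

# Hypothetical usage for a single 2D keypoint; the weights are random stand-ins
# for parameters that would normally be learned by minimizing the NLL.
rng = np.random.default_rng(0)
d, H = 2, 8
params = (rng.normal(size=(H, d)), rng.normal(size=H),
          0.1 * rng.normal(size=(d, H)), np.zeros(d))
y_true = np.array([0.52, 0.31])               # ground-truth keypoint (made up)
y_pred = np.array([0.50, 0.35])               # regression output (made up)
nll = -residual_log_density(y_true - y_pred, params)
print("negative log-likelihood:", nll)
```

Because the density comes from integrating an ODE rather than from a fixed Gaussian or Laplace form, the shape of the residual distribution is determined by the learned velocity field. A practical version would typically use an adaptive ODE solver and automatic differentiation in place of the hand-written Euler loop.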