PAD-Hand: Physics-Aware Diffusion for Hand Motion Recovery

2026-03-27
Citations: 0
Influential: 0
AI Summary
This work proposes a physics-aware conditional diffusion framework that addresses two gaps in existing hand motion reconstruction: estimates often lack physical consistency, and there is no measure of how physically plausible they are. The method formulates Euler-Lagrange dynamics for articulated hands on top of a MeshCNN-Transformer backbone and treats the resulting dynamic residuals as virtual observations in the diffusion process, rather than forcing them to zero, to refine noisy pose sequences into physically plausible hand motion. A last-layer Laplace approximation yields per-joint, per-time variances, giving interpretable variance maps that show where physical consistency weakens. Experiments on two well-known hand datasets show consistent gains over strong image-based initializations and competitive video-based methods, and qualitative results indicate that the estimated variances align with the physical plausibility of the motion.
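To make the idea of a dynamic residual used as a virtual observation concrete, here is a minimal sketch on a toy system. It assumes a 1-DoF pendulum as a stand-in for the articulated hand model, which the summary does not specify. The function names, the finite-difference acceleration, and the noise scale `sigma` are illustrative assumptions, not the paper's implementation. The residual is scored with a Gaussian likelihood centered at zero instead of being constrained to equal zero.

```python
import numpy as np

def pendulum_residual(theta, tau, dt, m=1.0, l=1.0, g=9.81):
    """Euler-Lagrange residual r = m*l^2*theta_ddot + m*g*l*sin(theta) - tau.

    Hypothetical toy system: acceleration comes from central finite
    differences, so the residual is defined at interior frames only.
    """
    theta_ddot = (theta[2:] - 2.0 * theta[1:-1] + theta[:-2]) / dt**2
    return m * l**2 * theta_ddot + m * g * l * np.sin(theta[1:-1]) - tau[1:-1]

def virtual_observation_nll(residual, sigma=1.0):
    """Gaussian negative log-likelihood of 'observing' zero for each residual.

    A soft penalty: small residuals are likely, large ones unlikely,
    but nothing forces the residual to vanish exactly.
    """
    quad = 0.5 * np.sum(residual**2) / sigma**2
    norm = residual.size * np.log(sigma * np.sqrt(2.0 * np.pi))
    return quad + norm

dt = 0.01
t = np.arange(0.0, 2.0, dt)

# A trajectory and the torque that makes it exactly consistent (m = l = 1).
theta_clean = 0.3 * np.sin(2.0 * t)
tau = -1.2 * np.sin(2.0 * t) + 9.81 * np.sin(theta_clean)

# The same trajectory with small per-frame jitter, as a noisy estimate might have.
rng = np.random.default_rng(0)
theta_noisy = theta_clean + 0.01 * rng.standard_normal(t.size)

res_clean = pendulum_residual(theta_clean, tau, dt)
res_noisy = pendulum_residual(theta_noisy, tau, dt)
nll_clean = virtual_observation_nll(res_clean)
nll_noisy = virtual_observation_nll(res_noisy)
```

In this toy setting the jittered sequence receives a much higher negative log-likelihood than the consistent one, because frame-to-frame noise is amplified by the second derivative. That is the kind of signal a physics-aware refinement step could use as guidance.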
๐Ÿ“ Abstract
Significant advancements made in reconstructing hands from images have delivered accurate single-frame estimates, yet they often lack physics consistency and provide no notion of how confidently the motion satisfies physics. In this paper, we propose a novel physics-aware conditional diffusion framework that refines noisy pose sequences into physically plausible hand motion while estimating the physics variance in motion estimates. Building on a MeshCNN-Transformer backbone, we formulate Euler-Lagrange dynamics for articulated hands. Unlike prior works that enforce zero residuals, we treat the resulting dynamic residuals as virtual observables to more effectively integrate physics. Through a last-layer Laplace approximation, our method produces per-joint, per-time variances that measure physics consistency and offers interpretable variance maps indicating where physical consistency weakens. Experiments on two well-known hand datasets show consistent gains over strong image-based initializations and competitive video-based methods. Qualitative results confirm that our variance estimations are aligned with the physical plausibility of the motion in image-based estimates.
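The last-layer Laplace approximation mentioned above can be sketched generically as follows. This is a minimal illustration under stated assumptions, not the paper's procedure: it assumes a linear last layer with a Gaussian likelihood and an isotropic Gaussian prior, where the Laplace posterior over the last-layer weights is exactly Gaussian. The feature shapes (16 frames, 21 joints, 8 features), `noise_std`, and `prior_prec` are all hypothetical. How the paper ties these variances to physics residuals is not described in the abstract.

```python
import numpy as np

def last_layer_laplace_variance(train_feats, query_feats, noise_std=0.1, prior_prec=1.0):
    """Predictive variance from a Laplace approximation over last-layer weights.

    train_feats: (N, D) penultimate-layer features seen during fitting.
    query_feats: (..., D) features to score, e.g. (frames, joints, D).
    Returns an array of shape query_feats.shape[:-1].
    """
    d = train_feats.shape[1]
    # Hessian of the negative log-posterior w.r.t. the last-layer weights.
    hessian = train_feats.T @ train_feats / noise_std**2 + prior_prec * np.eye(d)
    cov = np.linalg.inv(hessian)  # Gaussian posterior covariance
    flat = query_feats.reshape(-1, d)
    # phi^T * cov * phi for every query feature vector.
    var = np.einsum("nd,de,ne->n", flat, cov, flat)
    return var.reshape(query_feats.shape[:-1])

rng = np.random.default_rng(0)
train_feats = rng.standard_normal((500, 8))
query_feats = rng.standard_normal((16, 21, 8))  # hypothetical: 16 frames, 21 joints

variance_map = last_layer_laplace_variance(train_feats, query_feats)
```

The result has one variance per joint and per time step, so it can be rendered directly as a frames-by-joints heat map. Because only the last layer is treated probabilistically, the cost is one D-by-D inverse rather than a Hessian over the full network.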
Problem

Research questions and friction points this paper is trying to address.

hand motion recovery
physics consistency
dynamic residuals
pose refinement
physical plausibility
Innovation

Methods, ideas, or system contributions that make the work stand out.

physics-aware diffusion
hand motion recovery
Euler-Lagrange dynamics
dynamic residuals as observables
Laplace approximation