Ani3DHuman: Photorealistic 3D Human Animation with Self-guided Stochastic Sampling

📅 2026-02-22
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
Existing 3D human animation methods struggle to simultaneously achieve photorealism and identity consistency: kinematic approaches lack non-rigid dynamics such as cloth flutter, while video diffusion-based methods, though capable of modeling such details, often suffer from visual artifacts and identity distortion. This work proposes a hierarchical framework that integrates kinematic priors with a video diffusion model by decoupling rigid and non-rigid motion representations. The approach leverages kinematically rendered guidance to steer the diffusion model in recovering high-fidelity non-rigid details and introduces a self-guided stochastic sampling strategy to effectively mitigate sampling failures caused by out-of-distribution inputs. Experiments demonstrate that the method significantly outperforms state-of-the-art techniques in complex non-rigid dynamic scenarios, achieving leading performance in both visual realism and identity fidelity.
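The rigid/non-rigid decoupling described above can be illustrated with a toy sketch (not the paper's code): posed geometry is a skeleton-driven rigid transform of rest-pose points plus a per-point residual offset field standing in for the optimized non-rigid motion (e.g. cloth flutter). The function names and shapes here are illustrative assumptions.

```python
import numpy as np

def rigid_motion(points, rotation, translation):
    """Apply a rigid (skeleton-driven) transform to rest-pose points."""
    return points @ rotation.T + translation

def animate(points, rotation, translation, residual_field):
    """Layered motion: rigid pose change plus a per-point non-rigid residual.

    `residual_field` stands in for the residual non-rigid motion field
    that the paper optimizes from diffusion-restored video.
    """
    return rigid_motion(points, rotation, translation) + residual_field

# Toy example: rotate two points 90 degrees about z, translate, add offsets.
rest = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
Rz = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
t = np.array([0.0, 0.0, 0.5])
residual = np.array([[0.0, 0.05, 0.0], [0.0, 0.0, -0.05]])
posed = animate(rest, Rz, t, residual)
```

In the actual method the rigid part comes from a kinematic animation model and the residual field is supervised by the diffusion-restored videos; this sketch only shows how the two layers compose additively.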

📝 Abstract
Current 3D human animation methods struggle to achieve photorealism: kinematics-based approaches lack non-rigid dynamics (e.g., clothing dynamics), while methods that leverage video diffusion priors can synthesize non-rigid motion but suffer from quality artifacts and identity loss. To overcome these limitations, we present Ani3DHuman, a framework that marries kinematics-based animation with video diffusion priors. We first introduce a layered motion representation that disentangles rigid motion from residual non-rigid motion. Rigid motion is generated by a kinematic method, which then produces a coarse rendering to guide the video diffusion model in generating video sequences that restore the residual non-rigid motion. However, this restoration task, based on diffusion sampling, is highly challenging, as the initial renderings are out-of-distribution, causing standard deterministic ODE samplers to fail. Therefore, we propose a novel self-guided stochastic sampling method, which effectively addresses the out-of-distribution problem by combining stochastic sampling (for photorealistic quality) with self-guidance (for identity fidelity). These restored videos provide high-quality supervision, enabling the optimization of the residual non-rigid motion field. Extensive experiments demonstrate that Ani3DHuman can generate photorealistic 3D human animation, outperforming existing methods. Code is available at https://github.com/qiisun/ani3dhuman.
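The sampling idea in the abstract can be sketched in a toy 1D setting, assuming the usual structure of such samplers rather than the paper's implementation: each step applies a denoising update, injects fresh noise scaled by the remaining noise level (the stochastic part that helps recover from out-of-distribution inputs), and pulls the sample toward a reference signal standing in for the identity-preserving self-guidance. The denoiser, guidance weight, and schedule below are all illustrative assumptions.

```python
import numpy as np

def self_guided_stochastic_sample(x_init, denoise, reference, steps=20,
                                  noise_scale=0.1, guidance=0.3, seed=0):
    """Toy self-guided stochastic sampling loop (illustrative, not the paper's code).

    Per step: (1) denoising update, (2) stochastic noise injection with a
    decaying schedule, (3) self-guidance blend toward `reference`, which
    stands in for the identity-preserving condition.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x_init, dtype=float)
    for i in range(steps):
        t = 1.0 - i / steps                                     # decaying noise level
        x = denoise(x)                                          # denoising update
        x = x + noise_scale * t * rng.standard_normal(x.shape)  # stochastic injection
        x = x + guidance * (reference - x)                      # self-guidance blend
    return x

# Toy denoiser: a contraction pulling the sample toward the origin.
denoise = lambda x: 0.8 * x
ref = np.array([1.0, -1.0])
out = self_guided_stochastic_sample(np.array([5.0, 5.0]), denoise, ref)
```

With this contraction denoiser the iteration converges near the fixed point of the guided update (about 0.68 × `ref`), showing how self-guidance anchors the stochastic trajectory to the reference even from a far-off-distribution start.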
Problem

Research questions and friction points this paper is trying to address.

photorealistic
3D human animation
non-rigid dynamics
identity preservation
video diffusion
Innovation

Methods, ideas, or system contributions that make the work stand out.

self-guided stochastic sampling
layered motion representation
video diffusion priors
non-rigid dynamics
photorealistic 3D animation