StableIntrinsic: Detail-preserving One-step Diffusion Model for Multi-view Material Estimation

📅 2025-08-27
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing diffusion models for multi-view material estimation rely on multi-step denoising, which makes inference slow and produces high-variance outputs, at odds with the deterministic nature of material recovery. This paper proposes StableIntrinsic, a one-step diffusion model for multi-view material estimation. Its key contributions are: (1) a one-step diffusion formulation that avoids stochastic iterative sampling; (2) material-aware losses applied in pixel space to the albedo, metallic, and roughness predictions; and (3) a Detail Injection Network (DIN) that mitigates the detail loss caused by VAE encoding and sharpens the predicted maps. Compared with prior state-of-the-art methods, StableIntrinsic reports a 9.9% improvement in albedo PSNR and reductions of 44.4% and 60.0% in MSE for the metallic and roughness maps, respectively.

📝 Abstract
Recovering material information from images has been extensively studied in computer graphics and vision. Recent works in material estimation leverage diffusion models and show promising results. However, these diffusion-based methods adopt a multi-step denoising strategy, which makes each estimation time-consuming. Such stochastic inference also conflicts with the deterministic nature of the material estimation task, leading to high-variance estimates. In this paper, we introduce StableIntrinsic, a one-step diffusion model for multi-view material estimation that can produce high-quality material parameters with low variance. To address the over-smoothing problem in one-step diffusion, StableIntrinsic applies losses in pixel space, with each loss designed based on the properties of the material. Additionally, StableIntrinsic introduces a Detail Injection Network (DIN) to eliminate the detail loss caused by VAE encoding, while further enhancing the sharpness of the material predictions. The experimental results indicate that our method surpasses the current state-of-the-art techniques, achieving a $9.9\%$ improvement in the Peak Signal-to-Noise Ratio (PSNR) of albedo and reducing the Mean Square Error (MSE) for metallic and roughness by $44.4\%$ and $60.0\%$, respectively.
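
The abstract reports its gains as relative changes in PSNR and MSE. As a minimal sketch of how such figures are typically computed, the snippet below uses the standard definitions of MSE and PSNR on values in [0, 1]. The helper names (`mse`, `psnr`, `relative_change`) and the sample numbers are illustrative assumptions and are not taken from the paper.

```python
import math


def mse(pred, target):
    """Mean squared error between two equal-length sequences of pixel values."""
    if len(pred) != len(target) or not pred:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def psnr(pred, target, max_value=1.0):
    """Peak signal-to-noise ratio in dB: 10 * log10(max_value^2 / MSE)."""
    err = mse(pred, target)
    if err == 0:
        return math.inf  # identical inputs: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / err)


def relative_change(baseline, ours):
    """Signed change of `ours` relative to `baseline`, in percent."""
    return 100.0 * (ours - baseline) / baseline


# Made-up pixel values for illustration only (not data from the paper).
target = [0.2, 0.5, 0.8, 1.0]
pred = [0.1, 0.5, 0.8, 1.0]

print(f"MSE:  {mse(pred, target):.4f}")
print(f"PSNR: {psnr(pred, target):.2f} dB")
```

Under these definitions, a percentage improvement in PSNR is a ratio of two dB values, not a difference in dB. With a hypothetical baseline, a 60% MSE reduction corresponds to `relative_change(0.010, 0.004)`, which is approximately -60.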
Problem

Research questions and friction points this paper is trying to address.

Addressing multi-step diffusion inefficiency in material estimation
Reducing variance in deterministic material parameter prediction
Recovering detail lost to VAE encoding in diffusion models
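
The variance concern above can be made concrete by running an estimator several times on the same input and measuring how much each pixel changes across runs. The following is a toy sketch only: `noisy_estimator` and `deterministic_estimator` are hypothetical stand-ins, not the paper's models, and the noise scale is an arbitrary assumption.

```python
import random
from statistics import pvariance


def per_pixel_variance(runs):
    """Population variance of each pixel across repeated predictions."""
    if not runs:
        raise ValueError("need at least one run")
    return [pvariance(values) for values in zip(*runs)]


def noisy_estimator(image, rng, noise_scale=0.05):
    """Hypothetical stand-in for a stochastic sampler: output depends on drawn noise."""
    return [min(1.0, max(0.0, px + rng.gauss(0.0, noise_scale))) for px in image]


def deterministic_estimator(image):
    """Hypothetical stand-in for a deterministic one-step predictor."""
    return list(image)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
image = [0.2, 0.5, 0.8]  # made-up pixel values

stochastic_runs = [noisy_estimator(image, rng) for _ in range(8)]
deterministic_runs = [deterministic_estimator(image) for _ in range(8)]

stochastic_var = per_pixel_variance(stochastic_runs)
deterministic_var = per_pixel_variance(deterministic_runs)

print("stochastic:   ", stochastic_var)
print("deterministic:", deterministic_var)
```

A deterministic predictor yields zero variance across runs by construction, while the stochastic stand-in yields a different material map on every call.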
Innovation

Methods, ideas, or system contributions that make the work stand out.

One-step diffusion model for material estimation
Detail Injection Network enhances prediction sharpness
Losses in pixel space based on material properties
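
To illustrate what "losses in pixel space" means, the sketch below computes a separate loss on each decoded material map, rather than on latent codes, and combines them with weights. The source does not state which loss is used for which map, so the L1 and L2 choices, the weights, and the name `material_loss` are assumptions made purely for illustration.

```python
def l1(pred, target):
    """Mean absolute error over pixels."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


def l2(pred, target):
    """Mean squared error over pixels."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def material_loss(pred, target, weights=None):
    """Weighted sum of per-map losses on decoded (pixel-space) material maps.

    Assumption: L1 for albedo and L2 for metallic and roughness. The paper's
    actual per-material losses are not given in this summary.
    """
    weights = weights or {"albedo": 1.0, "metallic": 1.0, "roughness": 1.0}
    per_map = {
        "albedo": l1(pred["albedo"], target["albedo"]),
        "metallic": l2(pred["metallic"], target["metallic"]),
        "roughness": l2(pred["roughness"], target["roughness"]),
    }
    total = sum(weights[name] * value for name, value in per_map.items())
    return total, per_map


# Made-up two-pixel maps for illustration only.
pred_maps = {"albedo": [0.5, 0.5], "metallic": [1.0, 0.0], "roughness": [0.5, 0.5]}
target_maps = {"albedo": [0.4, 0.6], "metallic": [1.0, 0.0], "roughness": [0.3, 0.5]}

total, per_map = material_loss(pred_maps, target_maps)
print(per_map, total)
```

Keeping the per-map terms separate makes it possible to give each material channel its own loss, which is the general idea the list above describes.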
Authors

Xiuchao Wu
State Key Lab of CAD&CG, Zhejiang University, China and Alibaba Group, China
Pengfei Zhu
State Key Lab for Novel Software Technology, Nanjing University, China and Alibaba Group, China
Jiangjing Lyu
Alibaba (Computer Vision, Computer Graphics)
Xinguo Liu
State Key Laboratory of CAD&CG, Zhejiang University, China
Jie Guo
State Key Lab for Novel Software Technology, Nanjing University, China
Yanwen Guo
State Key Lab for Novel Software Technology, Nanjing University, China
Weiwei Xu
State Key Lab of CAD&CG, Zhejiang University, China
Chengfei Lyu
Alibaba Group, China