🤖 AI Summary
This work addresses the challenging problem of jointly reconstructing surface geometry, material properties, and scene illumination for glossy objects from multi-view images—without auxiliary supervision (e.g., depth, normals, or BRDF priors). We propose the first three-stage progressive inverse rendering framework based on neural implicit representations (SDF + radiance fields). Our method explicitly decouples direct/indirect illumination and specular/diffuse reflectance components—extending NeuS with novel modeling innovations: spherical Gaussian lighting, learnable visibility maps, specular-aware regularization, and joint SDF-radiance field regularization. Multi-stage knowledge distillation further enhances robustness. Evaluated on complex glossy objects, our approach achieves state-of-the-art performance: an 18% reduction in Chamfer distance, a 0.12 improvement in SSIM for material separation, and significantly enhanced fidelity in lighting estimation.
📝 Abstract
We develop a method that recovers the surface, materials, and illumination of a scene from its posed multi-view images. In contrast to prior work, it does not require any additional data and can handle glossy objects or bright lighting. It is a progressive inverse rendering approach, which consists of three stages. First, we reconstruct the scene radiance and signed distance function (SDF) with our novel regularization strategy for specular reflections. Our approach considers both the diffuse and specular colors, which allows for handling complex view-dependent lighting effects for surface reconstruction. Second, we distill light visibility and indirect illumination from the learned SDF and radiance field using learnable mapping functions. Third, we design a method for estimating the ratio of incoming direct light represented via Spherical Gaussians reflected in a specular manner and then reconstruct the materials and direct illumination of the scene. Experimental results demonstrate that the proposed method outperforms the current state-of-the-art in recovering surfaces, materials, and lighting without relying on any additional data.