🤖 AI Summary
This paper addresses the ill-posed problem of jointly reconstructing object geometry and spatially varying bidirectional reflectance distribution functions (SVBRDFs) from sparse images captured under collocated lighting. We propose ROSA, an inverse rendering method that uses differentiable rendering to optimize the mesh directly from the image data. Methodologically, we introduce a spatially adaptive mesh resolution scheme in which mesh refinement and local surface smoothness are conditioned on the estimated normal texture and the mesh curvature, coupled with tile-based synthesis of high-resolution SVBRDF textures driven by a single pretrained decoder. In contrast to approaches that reconstruct appearance on dense fixed-resolution meshes or rely on normal maps, our framework is not bound to a fixed texture resolution and avoids the missing shadows and incorrect silhouettes typical of normal maps. With only a small number of input images, it jointly reconstructs silhouettes, shadows, and fine surface details without requiring mesh simplification as a post-processing step.
📝 Abstract
Reconstructing an object's shape and appearance as a mesh textured by a spatially varying bidirectional reflectance distribution function (SVBRDF) from a limited set of images captured under collocated light is an ill-posed problem. Previous state-of-the-art approaches either reconstruct the appearance directly on the geometry or additionally use normal textures as part of the appearance features. However, the former requires detailed but inefficiently large meshes that would have to be simplified in a post-processing step, while the latter suffers from well-known limitations of normal maps such as missing shadows or incorrect silhouettes. Another limiting factor is the fixed and typically low resolution of the texture estimation, which results in the loss of important surface details. To overcome these problems, we present ROSA, an inverse rendering method that directly optimizes mesh geometry with spatially adaptive mesh resolution based solely on the image data. In particular, we refine the mesh and locally condition the surface smoothness based on the estimated normal texture and mesh curvature. In addition, we enable the reconstruction of fine appearance details in high-resolution textures through a pioneering tile-based method that operates on a single pre-trained decoder network but is not limited by the network output resolution.
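The tile-based idea in the abstract can be illustrated with a minimal sketch. This is not the ROSA implementation: `decode_tile` is a hypothetical stand-in for a pretrained decoder (here plain nearest-neighbor upsampling), and the names `decode_tiled`, `tile`, and `scale` are assumptions made for illustration. The sketch only shows how a decoder with a fixed output size might be applied tile by tile so that the assembled texture can be larger than any single decoder output.

```python
import numpy as np

def decode_tile(latent_tile, scale=4):
    """Hypothetical stand-in for a pretrained decoder with a fixed output size.

    Upsamples an (h, w, c) latent tile by `scale` using nearest-neighbor
    repetition. A real decoder would be a neural network.
    """
    return np.repeat(np.repeat(latent_tile, scale, axis=0), scale, axis=1)

def decode_tiled(latent, tile=8, scale=4):
    """Decode a large latent grid tile by tile and stitch the results."""
    H, W, C = latent.shape
    assert H % tile == 0 and W % tile == 0, "grid must be divisible into tiles"
    out = np.empty((H * scale, W * scale, C), dtype=latent.dtype)
    for y in range(0, H, tile):
        for x in range(0, W, tile):
            # Each call sees only a tile-sized input, as a fixed-size decoder would.
            patch = decode_tile(latent[y:y + tile, x:x + tile], scale)
            out[y * scale:(y + tile) * scale,
                x * scale:(x + tile) * scale] = patch
    return out

# Assumed toy latent grid: 32 x 16 cells with 3 channels.
latent = np.random.default_rng(0).random((32, 16, 3)).astype(np.float32)
texture = decode_tiled(latent)
print(texture.shape)
```

Because the stand-in decoder is purely local, the stitched result matches decoding the whole grid at once. A learned decoder with a receptive field that crosses tile borders would generally need overlapping tiles or blending to avoid visible seams, which this sketch does not address.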