LumiNet: Latent Intrinsics Meets Diffusion Models for Indoor Scene Relighting

📅 2024-11-29
🏛️ Computer Vision and Pattern Recognition
📈 Citations: 9
Influential: 2
🤖 AI Summary
This work addresses single-image relighting of indoor scenes, aiming to synthesize photorealistic outputs that preserve the original geometry and material properties while accurately reproducing target illumination—including specular highlights and global illumination—using only a source RGB image and a target lighting map. To this end, we propose a dual-image latent-space co-control paradigm: (1) constructing a large-scale, synthetic lighting-scene paired dataset using StyleGAN; (2) designing a novel ControlNet variant that explicitly disentangles and fuses intrinsic scene attributes (albedo and geometry) from the source image with extrinsic lighting representations from the target; and (3) incorporating cross-attention mechanisms and lightweight adapters for efficient fine-tuning. Extensive experiments on complex indoor scenes demonstrate substantial improvements over state-of-the-art methods. Our approach enables robust lighting transfer across diverse scene layouts and material types, requiring only RGB inputs.
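The dual-image conditioning described above can be sketched in a few lines. The sketch below is illustrative only: the encoders are stand-in random projections (the paper uses learned latent-intrinsics and lighting encoders), and the latent dimensions (256 for intrinsics, 64 for extrinsics) are assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def encode_intrinsic(image, dim=256):
    """Stand-in for a latent-intrinsics encoder: maps an image to
    albedo/geometry codes. Here just a fixed random projection."""
    w = rng.standard_normal((image.size, dim)) / np.sqrt(image.size)
    return image.ravel() @ w

def encode_extrinsic(image, dim=64):
    """Stand-in for a lighting (extrinsic) encoder."""
    w = rng.standard_normal((image.size, dim)) / np.sqrt(image.size)
    return image.ravel() @ w

def build_condition(source_img, target_img):
    """Fuse source intrinsics with target extrinsics into a single
    conditioning vector for the ControlNet-style branch."""
    z_int = encode_intrinsic(source_img)   # geometry + albedo from source
    z_ext = encode_extrinsic(target_img)   # lighting from target
    return np.concatenate([z_int, z_ext])

source = rng.random((32, 32, 3))
target = rng.random((32, 32, 3))
cond = build_condition(source, target)   # shape (256 + 64,) = (320,)
```

The key structural point is that the condition is built from *two different images*, unlike a standard ControlNet, whose conditioning maps all come from the scene being generated.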

📝 Abstract
We introduce LumiNet, a novel architecture that leverages generative models and latent intrinsic representations for transferring lighting from one image to another. Given a source image and a target lighting image, LumiNet generates a relit version of the source scene that captures the target’s lighting. Our approach makes two key contributions: a data curation strategy from the StyleGAN-based relighting model for our training, and a modified diffusion-based ControlNet that processes both latent intrinsic properties from the source image and latent extrinsic properties from the target image. We further improve lighting transfer through a learned adaptor that injects the target’s latent extrinsic properties via cross-attention and lightweight fine-tuning. Unlike traditional ControlNet, which generates images with conditional maps from a single scene, LumiNet processes latent representations from two different images, preserving geometry and albedo from the source while transferring lighting characteristics from the target. Experiments demonstrate that our method successfully transfers complex lighting phenomena including specular highlights and indirect illumination across scenes with varying spatial layouts and materials, outperforming existing approaches on challenging indoor scenes using only images as input.
Problem

Research questions and friction points this paper is trying to address.

Transfer lighting between indoor scenes
Preserve geometry and albedo from source
Handle complex lighting phenomena like specular highlights
Innovation

Methods, ideas, or system contributions that make the work stand out.

Leverages generative models and latent intrinsics
Uses modified ControlNet for dual-image processing
Employs learned MLP adaptor for lighting injection
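The last bullet — a learned MLP adaptor injecting lighting via cross-attention — can be sketched as follows. Everything here is an illustrative assumption (single-head attention, the residual injection, and all dimensions); the paper's actual adaptor and attention layout may differ.

```python
import numpy as np

rng = np.random.default_rng(1)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def mlp_adaptor(z_ext, w1, w2):
    """Lightweight MLP mapping extrinsic lighting latents into the
    denoiser's token space (hypothetical shapes)."""
    return np.maximum(z_ext @ w1, 0.0) @ w2   # ReLU MLP

def cross_attention(scene_tokens, light_tokens, wq, wk, wv):
    """Queries come from scene features, keys/values from lighting
    tokens, so illumination modulates the scene representation
    without overwriting its geometry/albedo content."""
    q = scene_tokens @ wq
    k = light_tokens @ wk
    v = light_tokens @ wv
    attn = softmax(q @ k.T / np.sqrt(q.shape[-1]))
    return scene_tokens + attn @ v   # residual injection

d, d_ext, n_scene, n_light = 16, 8, 10, 4
scene = rng.standard_normal((n_scene, d))     # scene feature tokens
z_ext = rng.standard_normal((n_light, d_ext))  # target lighting latents
tokens = mlp_adaptor(z_ext,
                     rng.standard_normal((d_ext, 32)),
                     rng.standard_normal((32, d)))
out = cross_attention(scene, tokens,
                      rng.standard_normal((d, d)),
                      rng.standard_normal((d, d)),
                      rng.standard_normal((d, d)))
```

Keeping the queries on the scene side is the design choice that matches the stated goal: the output token set has the same shape as the scene tokens, so source structure is preserved while lighting information is attended in.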