🤖 AI Summary
This work addresses the challenge of high parameter counts in existing intrinsic image decomposition methods, which hinder deployment in resource-constrained or real-time settings. To this end, we introduce flow matching—a technique previously unexplored in this domain—and propose a novel single-step decomposition framework operating in a variational autoencoder (VAE)-guided latent space. By jointly optimizing the latent representation and the flow matching module, our method efficiently and stably disentangles albedo and shading in a single inference step. Extensive experiments on multiple benchmark datasets demonstrate that the proposed model achieves decomposition performance comparable to or better than state-of-the-art approaches while using significantly fewer parameters. This favorable balance of efficiency and accuracy makes our approach particularly suitable for real-time and lightweight applications.
📝 Abstract
Intrinsic Image Decomposition (IID) separates an image into albedo and shading components. It is a core step in many real-world applications, such as relighting and material editing. Existing IID models achieve good results but often rely on a large number of parameters, which makes them costly to combine with other models in real-world settings. To address this problem, we propose FlowIID, a novel architecture based on latent flow matching. FlowIID combines a VAE-guided latent space with a flow matching module, enabling a stable decomposition of albedo and shading. FlowIID is not only parameter-efficient but also produces results in a single inference step. Despite its compact design, FlowIID delivers results that are competitive with or superior to those of existing models across various benchmarks. This makes it well suited for deployment in resource-constrained and real-time vision applications.
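To make "a single inference step" concrete, the following is a minimal sketch of one-step flow matching sampling in a latent space. It is not the FlowIID implementation: the abstract does not specify the network, the conditioning, or the integration scheme, so the straight-line probability path, the single Euler step, and the names `toy_velocity` and `single_step_sample` are all assumptions made for illustration. The velocity function is a hand-written stand-in for a learned network, and the VAE encoder and decoder are omitted.

```python
import numpy as np


def toy_velocity(z, t, cond):
    """Hypothetical stand-in for a learned velocity field v(z, t, cond).

    Assumes straight-line paths z_t = (1 - t) * z0 + t * z1, for which the
    ideal velocity at (z_t, t) is (z1 - z_t) / (1 - t). The "true" target
    latent is taken to be 0.5 * cond purely for illustration.
    """
    target = 0.5 * cond
    return (target - z) / (1.0 - t)


def single_step_sample(velocity_fn, z0, cond):
    """One Euler step from t = 0 to t = 1 with step size 1."""
    return z0 + 1.0 * velocity_fn(z0, 0.0, cond)


rng = np.random.default_rng(0)
cond = rng.standard_normal((4, 8, 8))  # hypothetical encoded input image
z0 = rng.standard_normal((4, 8, 8))    # Gaussian noise at t = 0
z1 = single_step_sample(toy_velocity, z0, cond)

# With an exact velocity on straight paths, one step lands on the target.
print(np.allclose(z1, 0.5 * cond))
```

In this toy setting a single step is exact because the path is a straight line and the velocity is ideal. A trained model only approximates that velocity, so how well one step works in practice depends on training, which the abstract reports only at the level of benchmark results.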