AI Summary
This work addresses the challenge of single-image reflection separation, where the transmission and reflection layers are highly nonlinearly coupled in real-world imaging, a complexity that existing methods fail to capture accurately due to their reliance on simplified linear superposition models. To overcome this limitation, the authors propose a learnable nonlinear superposition model that abandons the linearity assumption in the sRGB domain, thereby more faithfully representing inter-layer interactions. They further introduce a unified dual-stream interaction framework that explicitly models the bidirectional dependencies between transmission and reflection through a fusion of activation, gating, and attention mechanisms, while remaining compatible with both CNN and Transformer backbones. The method achieves state-of-the-art performance across multiple real-world benchmarks and demonstrates strong generalization, offering a novel paradigm for image decomposition tasks.
Abstract
Single-image reflection separation is fundamentally challenged by the entanglement of transmission and reflection layers under complex image formation processes. Existing approaches largely rely on simplified assumptions or independent modeling, limiting their ability to handle real-world scenarios. In this work, we revisit the problem from a unified perspective and identify a key issue with existing approaches: the widely adopted linear composition model in the sRGB domain fails to capture the nonlinear coupling introduced by real-world image signal processing pipelines. To address this, we introduce a learnable nonlinear superposition model that more faithfully characterizes layer interactions and improves decomposition fidelity. Building upon this formulation, we propose a generalized dual-stream interactive framework that explicitly models bidirectional dependencies between transmission and reflection through feature exchange. This framework unifies activation-, gating-, and attention-based interaction mechanisms, and is compatible with both CNN and Transformer backbones. Extensive experiments on diverse real-world benchmarks demonstrate that the proposed approach achieves superior performance with strong generalization capability. More importantly, our study reveals that reflection separation is not about undoing a linear mixture, but about learning nonlinear formation and interaction, offering new insights into the design of principled image decomposition models. Code and models are publicly available at https://mingcv.github.io/DIRS-Page.
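To make the claim about the sRGB domain concrete, the following is a minimal sketch and not the authors' learnable model. It assumes the two layers add in linear light and that the camera pipeline is reduced to the standard sRGB transfer curve alone, whereas a real image signal processing pipeline also applies white balance, tone mapping, and clipping. The intensity values `t_lin` and `r_lin` are hypothetical.

```python
def srgb_encode(x: float) -> float:
    """Standard sRGB transfer function: linear intensity in [0, 1] -> sRGB value."""
    x = min(max(x, 0.0), 1.0)  # clamp to the valid range
    if x <= 0.0031308:
        return 12.92 * x
    return 1.055 * x ** (1.0 / 2.4) - 0.055


def srgb_decode(v: float) -> float:
    """Inverse of srgb_encode: sRGB value in [0, 1] -> linear intensity."""
    v = min(max(v, 0.0), 1.0)
    if v <= 0.04045:
        return v / 12.92
    return ((v + 0.055) / 1.055) ** 2.4


# Hypothetical linear-light intensities of the transmission and reflection layers.
t_lin, r_lin = 0.30, 0.10

# Assumption: the layers superimpose additively before encoding.
observed = srgb_encode(t_lin + r_lin)

# The linear composition model applied directly to sRGB values.
naive = srgb_encode(t_lin) + srgb_encode(r_lin)

# The linear model overestimates the mixture; the gap is the nonlinear coupling.
gap = naive - observed
print(f"observed={observed:.4f}  naive={naive:.4f}  gap={gap:.4f}")
```

Even under this simplified pipeline, adding the sRGB-encoded layers does not reproduce the encoded mixture, so subtracting one layer from the observation in sRGB space cannot recover the other exactly.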