AI Summary
Traditional deraining methods trained in the sRGB domain suffer from degradations introduced by the image signal processor (ISP), including color crosstalk, dynamic range compression, and detail blurring. To address this, we propose an ISP-last paradigm for low-level vision that, for the first time, enables end-to-end deraining directly in the native 12-bit Bayer domain, bypassing ISP-induced distortions. To support this paradigm, we introduce Raw-Rain, the first real-world paired dataset of raw Bayer and corresponding sRGB images captured under rainy conditions. We further design the Information Conservation Score (ICS), a perceptually grounded, color-invariant metric that aligns more closely with human visual perception. Our method achieves gains of +0.99 dB PSNR and +1.2% ICS over state-of-the-art sRGB-domain approaches, while running faster and halving GFLOPs. These results empirically validate the superiority and practicality of Bayer-domain reconstruction for deraining.
Abstract
Image reconstruction from corrupted images is crucial across many domains. Most reconstruction networks are trained on post-ISP sRGB images, even though the image-signal-processing pipeline irreversibly mixes colors, clips dynamic range, and blurs fine detail. Using rain degradation as a case study, this paper shows that these losses are avoidable and demonstrates that learning directly on raw Bayer mosaics yields superior reconstructions. To substantiate the claim, we (i) evaluate both post-ISP and Bayer reconstruction pipelines, (ii) curate Raw-Rain, the first public benchmark of real rainy scenes captured in both 12-bit Bayer and bit-depth-matched sRGB, and (iii) introduce the Information Conservation Score (ICS), a color-invariant metric that aligns more closely with human opinion than PSNR or SSIM. On the test split, our raw-domain model improves on sRGB results by up to +0.99 dB PSNR and +1.2% ICS, while running faster with half the GFLOPs. These results advocate an ISP-last paradigm for low-level vision and open the door to end-to-end learnable camera pipelines.