🤖 AI Summary
This work proposes SpiralDiff, a unified diffusion framework for RGB-to-RAW conversion that addresses two challenges: reconstruction difficulty that varies with pixel intensity, and the need to adapt to diverse in-camera ISP characteristics across devices. SpiralDiff couples a signal-dependent noise weighting strategy with a lightweight, camera-aware adaptation module, CamLoRA, enabling intensity-adaptive RAW reconstruction and cross-camera compatibility within a single model. By combining signal-dependent noise weighting with plug-and-play LoRA-based fine-tuning, SpiralDiff substantially improves RAW synthesis quality across four benchmark datasets and boosts downstream object detection performance in the RAW domain.
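The summary does not give the exact form of the signal-dependent weighting, but the idea of reweighting the diffusion denoising loss by pixel intensity can be sketched as follows. The weighting function `signal_dependent_weight` (its name, the inverse-intensity form, and the `alpha`/`eps` parameters are all illustrative assumptions, not details from the paper) upweights low-intensity pixels, which are typically harder to reconstruct in RAW space:

```python
import numpy as np

def signal_dependent_weight(x0, alpha=1.0, eps=1e-3):
    # Hypothetical intensity-dependent weight: darker pixels (low signal)
    # receive larger weights. The exact form used by SpiralDiff is not
    # specified in the summary above.
    return 1.0 / (alpha * x0 + eps)

def weighted_diffusion_loss(pred_noise, true_noise, x0):
    # Standard noise-prediction MSE, reweighted per pixel by the
    # clean-signal intensity x0.
    w = signal_dependent_weight(x0)
    return float(np.mean(w * (pred_noise - true_noise) ** 2))

rng = np.random.default_rng(0)
x0 = rng.uniform(0.0, 1.0, size=(4, 4))        # clean RAW intensities in [0, 1]
noise = rng.standard_normal((4, 4))            # true diffusion noise
pred = noise + 0.1 * rng.standard_normal((4, 4))  # imperfect noise prediction
loss = weighted_diffusion_loss(pred, noise, x0)
```

Under this sketch, a fixed prediction error on a dark pixel contributes more to the loss than the same error on a bright pixel, steering the model's capacity toward the harder intensity range.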
📝 Abstract
RAW images preserve higher fidelity and richer scene information than RGB, making them essential for tasks under challenging imaging conditions. To alleviate the high cost of data collection, recent RGB-to-RAW conversion methods aim to synthesize RAW images from RGB. However, they overlook two key challenges: (i) the reconstruction difficulty varies with pixel intensity, and (ii) multi-camera conversion requires camera-specific adaptation. To address these issues, we propose SpiralDiff, a diffusion-based framework tailored for RGB-to-RAW conversion with a signal-dependent noise weighting strategy that adapts reconstruction fidelity across intensity levels. In addition, we introduce CamLoRA, a camera-aware lightweight adaptation module that enables a unified model to adapt to different camera-specific ISP characteristics. Extensive experiments on four benchmark datasets demonstrate the superiority of SpiralDiff in RGB-to-RAW conversion quality and its downstream benefits in RAW-based object detection. Our code and model are available at https://github.com/Chuancy-TJU/SpiralDiff.
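The abstract describes CamLoRA as a plug-and-play, camera-aware LoRA module attached to a single shared model. A minimal sketch of that pattern, assuming a standard LoRA layout (one shared base weight plus per-camera low-rank factors; the class name, rank, and zero-initialization of `B` are conventional LoRA choices, not details confirmed by the paper):

```python
import numpy as np

class CamLoRALinear:
    # Sketch of a camera-aware LoRA-adapted linear layer: one shared base
    # weight W, plus per-camera low-rank corrections B @ A selected at
    # inference time by a camera identifier.
    def __init__(self, d_in, d_out, rank=4, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.standard_normal((d_out, d_in)) * 0.02  # shared base weight
        self.adapters = {}  # camera_id -> (A, B)
        self.rank = rank

    def add_camera(self, camera_id, seed=1):
        rng = np.random.default_rng(seed)
        A = rng.standard_normal((self.rank, self.W.shape[1])) * 0.01
        B = np.zeros((self.W.shape[0], self.rank))  # zero init: no change at start
        self.adapters[camera_id] = (A, B)

    def forward(self, x, camera_id=None):
        y = self.W @ x
        if camera_id in self.adapters:
            A, B = self.adapters[camera_id]
            y = y + B @ (A @ x)  # low-rank camera-specific correction
        return y
```

The design keeps the shared backbone frozen while each camera contributes only the small `(A, B)` pair, which is what makes per-device fine-tuning lightweight and adapters swappable at inference.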