🤖 AI Summary
To address the high computational cost of ultra-high-definition (UHD) image deblurring, which prevents real-time processing on a single GPU, this paper proposes the Multi-Scale Cubic-Mixer network (Cubic-Mixer). Instead of computationally expensive self-attention, Cubic-Mixer operates directly on the complex-valued output of the fast Fourier transform (FFT), processing the real and imaginary components jointly to estimate the Fourier coefficients of the deblurred image. The paper presents this as the first self-attention-free cubic mixing architecture and the first to integrate multi-scale hybrid operations in the complex frequency domain. Coupled with a sliding-patch inference strategy, the method enables real-time UHD deblurring (>25 FPS) on a single GPU for 4K and 8K images. Extensive experiments demonstrate state-of-the-art accuracy on multiple established benchmarks and a newly introduced UHD dataset, alongside a 3.2× reduction in inference latency.
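The patch-based inference idea can be illustrated with a minimal sketch. It assumes non-overlapping tiles and a caller-supplied `patch_fn` standing in for the deblurring network. The names `deblur_in_patches`, `patch_fn`, and `patch` are hypothetical, and the paper's actual strategy may use overlapping or sliding windows.

```python
import numpy as np

def deblur_in_patches(image, patch_fn, patch=256):
    """Apply patch_fn to tiles of an (H, W, C) image and stitch the results.

    Assumption: tiles do not overlap, and patch_fn returns an array with the
    same shape as its input tile. Edge tiles may be smaller than `patch`.
    """
    h, w, _ = image.shape
    out = np.empty_like(image)
    for top in range(0, h, patch):
        for left in range(0, w, patch):
            tile = image[top:top + patch, left:left + patch]
            out[top:top + patch, left:left + patch] = patch_fn(tile)
    return out
```

Processing one tile at a time bounds peak memory by the tile size instead of the full 4K or 8K frame, which is presumably what makes single-GPU inference feasible.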
📝 Abstract
Transformer-based algorithms have recently attracted considerable attention in image deblurring. Their success depends on a self-attention mechanism with a CNN stem that models long-range dependencies between tokens. Unfortunately, this appealing pipeline has high computational complexity, which makes it difficult to process an ultra-high-definition image on a single GPU in real time. To trade off accuracy against efficiency, we compute the degraded input image cyclically over its three dimensions ($C$, $W$, and $H$) without a self-attention mechanism. We term this deep network the Multi-scale Cubic-Mixer; it acts on both the real and imaginary components after a fast Fourier transform to estimate the Fourier coefficients and thus obtain a deblurred image. Furthermore, we combine the Multi-scale Cubic-Mixer with a slicing strategy to generate high-quality results at a much lower computational cost. Experimental results demonstrate that the proposed algorithm performs favorably against state-of-the-art deblurring approaches on several benchmarks and a new ultra-high-definition dataset in terms of accuracy and speed.
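As a rough illustration of mixing over $C$, $H$, and $W$ in the Fourier domain, the sketch below applies one linear mixing step per axis to the real and imaginary parts of a 2D FFT, then inverts the transform. It is not the authors' network. The function names and weight matrices are hypothetical, the weights are fixed rather than learned, and the multi-scale branches and any nonlinearities are omitted.

```python
import numpy as np

def mix_axis(x, weight, axis):
    """Linearly mix entries of x along one axis with a square (n, n) matrix."""
    moved = np.moveaxis(x, axis, -1)   # bring the target axis to the end
    mixed = moved @ weight.T           # (..., n) @ (n, n) -> (..., n)
    return np.moveaxis(mixed, -1, axis)

def cubic_mix_fourier(image, w_c, w_h, w_w):
    """Mix a (C, H, W) real image along C, H, and W in the frequency domain.

    Assumption: the same weights are applied to the real and imaginary parts.
    """
    spec = np.fft.fft2(image, axes=(-2, -1))      # complex spectrum per channel
    parts = np.stack([spec.real, spec.imag])      # shape (2, C, H, W)
    for axis, weight in ((1, w_c), (2, w_h), (3, w_w)):
        parts = mix_axis(parts, weight, axis)     # one mixing step per axis
    mixed_spec = parts[0] + 1j * parts[1]
    # Arbitrary weights can break Hermitian symmetry, so keep the real part only.
    return np.fft.ifft2(mixed_spec, axes=(-2, -1)).real
```

Each step costs a matrix product along a single axis, so the cost grows with the axis lengths rather than quadratically in the number of tokens as in self-attention.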