AI Summary
To address the challenge of jointly modeling short-range texture details and long-range semantic dependencies in single-image dehazing, this paper proposes DehazeMatic, a novel framework featuring a dual-path Transformer-Mamba module that separately captures local texture and global structural dependencies. It introduces, for the first time, CLIP-derived semantic priors jointly with self-estimated haze density maps to guide dynamic, multi-scale feature fusion in a semantics-aware manner, enabling adaptive dependency aggregation. By breaking the limitations of monolithic architectures, DehazeMatic significantly improves restoration robustness under non-uniform haze distributions and complex scenes. Extensive experiments demonstrate state-of-the-art performance on RESIDE and D-Hazy benchmarks, achieving average gains of +1.23 dB in PSNR and +0.018 in SSIM over prior methods. Moreover, the framework exhibits superior detail fidelity and enhanced cross-domain generalization capability.
Abstract
Haze removal aims to restore a clear image from a hazy input. Existing methods have shown significant efficacy by capturing either short-range dependencies for local detail preservation or long-range dependencies for global context modeling. Given the complementary strengths of both approaches, an intuitive advancement is to explicitly integrate them into a unified framework. However, this potential remains underexplored in current research. In this paper, we propose DehazeMatic, which leverages the proposed Transformer-Mamba Dual Aggregation block to simultaneously and explicitly capture both short- and long-range dependencies through a dual-path design for improved restoration. To ensure that dependencies at varying ranges contribute optimally to performance, we conduct extensive experiments to identify key influencing factors and determine that an effective aggregation mechanism should be guided by the joint consideration of haze density and semantic information. Building on these insights, we introduce the CLIP-enhanced Dual-path Aggregator, which utilizes the rich semantic priors encapsulated in CLIP and the estimated haze density map, derived from its powerful generalization ability, to instruct the aggregation process. Extensive experiments demonstrate that DehazeMatic outperforms state-of-the-art methods across various benchmarks.
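The aggregation idea in the abstract can be illustrated with a toy example: a short-range feature and a long-range feature are blended per pixel, with the blend weight driven by a haze density value and a semantic score. This is only a sketch under stated assumptions. The function name `aggregate`, the parameters `w_density`, `w_semantic`, and `bias`, the sigmoid gate, and the choice to favor the long-range path under denser haze are all hypothetical; the abstract does not specify how the CLIP-enhanced Dual-path Aggregator computes its weights.

```python
import math


def sigmoid(x: float) -> float:
    """Standard logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def aggregate(short_feat, long_feat, density, semantic,
              w_density=2.0, w_semantic=1.0, bias=-1.0):
    """Blend two 2-D feature maps pixel by pixel.

    All four inputs are lists of rows with identical shapes.
    `density` and `semantic` stand in for a haze density map and a
    semantic score map; the linear-plus-sigmoid gate and its default
    coefficients are illustrative assumptions, not the paper's method.
    """
    fused = []
    for s_row, l_row, d_row, m_row in zip(short_feat, long_feat, density, semantic):
        fused_row = []
        for s, l, d, m in zip(s_row, l_row, d_row, m_row):
            # alpha is the weight on the long-range path (assumed form).
            alpha = sigmoid(w_density * d + w_semantic * m + bias)
            fused_row.append(alpha * l + (1.0 - alpha) * s)
        fused.append(fused_row)
    return fused


# Toy 1x2 feature maps: the second pixel is hazier than the first.
short = [[1.0, 1.0]]
long_ = [[3.0, 3.0]]
density = [[0.5, 0.9]]
semantic = [[0.0, 0.0]]

fused = aggregate(short, long_, density, semantic)
# With the default coefficients, the first pixel's gate input is 0, so its
# weight is 0.5 and the result is the midpoint of the two paths.
print(fused)
```

With these toy coefficients the hazier pixel leans further toward the long-range value, while the fused value always stays between the two path outputs because the gate is a convex weight.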