🤖 AI Summary
Existing Mamba-based image restoration methods suffer from fixed scanning orders and inefficient feature utilization, limiting their adaptability to diverse degradations. To address this, we propose VAMamba, a visual adaptive Mamba framework. First, we introduce the QCLAM caching mechanism, which stores historical representations in a FIFO cache and enables low-rank dynamic fusion of current and cached features. Second, we design GPS-SS2D, a greedy path-searching strategy that leverages ViT-based importance scoring to generate degradation-aware scanning paths, replacing fixed scan orders in SS2D. Together, LoRA adaptation, FIFO caching, and low-rank updates improve modeling flexibility while controlling memory growth and computation. Extensive experiments demonstrate that VAMamba consistently outperforms state-of-the-art methods across multiple restoration benchmarks, achieving a superior trade-off between reconstruction fidelity and inference speed, and ablation studies validate both the adaptive scanning and memory-augmented mechanisms.
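To make the QCLAM idea concrete, here is a minimal sketch of a FIFO feature cache with similarity-guided fusion. The class name `FeatureCache`, the cosine-similarity softmax weighting, and the 50/50 blend are hypothetical illustrations chosen for clarity; they are not the paper's actual implementation.

```python
from collections import deque
import numpy as np

class FeatureCache:
    """Hypothetical QCLAM-style sketch: a bounded FIFO cache of past
    feature vectors, fused with the current feature by weights derived
    from cosine similarity. The deque's maxlen caps memory growth."""

    def __init__(self, capacity=4):
        self.cache = deque(maxlen=capacity)  # oldest entries evicted first

    def fuse(self, feat):
        if self.cache:
            # Cosine similarity between current and each cached feature.
            sims = np.array([
                float(feat @ c / (np.linalg.norm(feat) * np.linalg.norm(c) + 1e-8))
                for c in self.cache
            ])
            weights = np.exp(sims) / np.exp(sims).sum()  # softmax weighting
            memory = sum(w * c for w, c in zip(weights, self.cache))
            fused = 0.5 * feat + 0.5 * memory  # illustrative fixed blend
        else:
            fused = feat  # nothing cached yet
        self.cache.append(feat)
        return fused
```

The `maxlen` argument of `collections.deque` gives FIFO eviction for free, which is one simple way to realize the "effectively controlling memory growth" property described in the abstract.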
📝 Abstract
Recent Mamba-based image restoration methods have achieved promising results but remain
limited by fixed scanning patterns and inefficient feature utilization. Conventional Mamba
architectures rely on predetermined paths that cannot adapt to diverse degradations, constraining
both restoration performance and computational efficiency. To overcome these limitations, we
propose VAMamba, a Visual Adaptive Mamba framework with two key innovations. First,
QCLAM (Queue-based Cache Low-rank Adaptive Memory) enhances feature learning through a
FIFO cache that stores historical representations. Similarity between current LoRA-adapted and
cached features guides intelligent fusion, enabling dynamic reuse while effectively controlling
memory growth. Second, GPS-SS2D (Greedy Path Scan SS2D) introduces adaptive scanning. A
Vision Transformer generates score maps to estimate pixel importance, and a greedy strategy determines optimal forward and backward scanning paths. These learned trajectories replace rigid
patterns, enabling SS2D to perform targeted feature extraction. The integration of QCLAM and
GPS-SS2D allows VAMamba to adaptively focus on degraded regions while maintaining high
computational efficiency. Extensive experiments across diverse restoration tasks demonstrate
that VAMamba consistently outperforms existing approaches in both restoration quality and
efficiency, establishing new benchmarks for adaptive image restoration. Our code is available
at https://github.com/WaterHQH/VAMamba.
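The greedy path construction described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's algorithm: `greedy_scan_path` starts at the highest-score pixel of a 2D importance map, repeatedly steps to the highest-score unvisited 4-neighbour, and jumps to the best unvisited pixel globally when stuck, so every pixel is visited exactly once. The function name and tie-breaking rules are assumptions.

```python
import numpy as np

def greedy_scan_path(scores):
    """Sketch of a GPS-SS2D-style greedy scan order over a 2D score map.

    Returns a list of (row, col) positions covering every pixel once,
    preferring high-importance pixels early in the sequence."""
    h, w = scores.shape
    visited = np.zeros((h, w), dtype=bool)
    r, c = np.unravel_index(np.argmax(scores), scores.shape)  # best pixel first
    path = [(r, c)]
    visited[r, c] = True
    while len(path) < h * w:
        # Unvisited 4-connected neighbours of the current position.
        neigh = [(r + dr, c + dc)
                 for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1))
                 if 0 <= r + dr < h and 0 <= c + dc < w
                 and not visited[r + dr, c + dc]]
        if neigh:
            r, c = max(neigh, key=lambda p: scores[p])  # greedy local step
        else:
            # Dead end: jump to the best unvisited pixel anywhere.
            masked = np.where(visited, -np.inf, scores)
            r, c = np.unravel_index(np.argmax(masked), scores.shape)
        path.append((r, c))
        visited[r, c] = True
    return path
```

Reversing the returned list would give the corresponding backward path, matching the forward/backward scanning pairs the abstract attributes to GPS-SS2D.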