🤖 AI Summary
In snapshot spectral compressive imaging, reconstructing 3D hyperspectral cubes from 2D measurements is an ill-posed, nonlinear inverse problem. To address this, we propose the Mamba-inspired Joint Unfolding Network (MiJUN). Methodologically, MiJUN introduces: (i) a novel trapezoidal discretization scheme for accelerated iterative unfolding; (ii) a Transformer-like architecture derived from Mamba, integrating global-local attention mechanisms; and (iii) the first application of tensor mode-k unfolding with 12-directional scanning to exploit multimodal low-rank priors. By embedding the physical forward model, incorporating selective state-space modeling, and employing a second-order differential equation-driven half-quadratic splitting optimizer, MiJUN achieves significant improvements in reconstruction fidelity—particularly for texture, edges, and subtle spectral variations—on both simulated and real-world data. It establishes new state-of-the-art performance in both quantitative metrics and visual quality.
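To make the optimizer the summary refers to concrete, here is a minimal sketch of plain half-quadratic splitting (HQS) for a generic linear inverse problem. It omits the paper's second-order/trapezoidal acceleration and uses a simple soft-threshold prior in place of MiJUN's learned modules; the operator `A`, the prior, and all parameters are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1, used here as a toy prior."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def hqs(y, A, mu=0.05, rho=1.0, iters=50):
    """HQS for min_x ||y - A x||^2 + mu * R(x), alternating a prior
    (proximal) step on the auxiliary variable z with a quadratic data
    step on x. All choices here are illustrative stand-ins."""
    x = A.T @ y  # simple initialization
    for _ in range(iters):
        z = soft_threshold(x, mu / rho)  # prior step
        # data step: solve (A^T A + rho I) x = A^T y + rho z
        x = np.linalg.solve(A.T @ A + rho * np.eye(A.shape[1]),
                            A.T @ y + rho * z)
    return x

rng = np.random.default_rng(0)
A = rng.standard_normal((8, 5))          # toy sensing operator
x_true = np.array([1.0, 0.0, -2.0, 0.0, 0.5])
x_hat = hqs(A @ x_true, A)               # noiseless toy measurement
```

A deep unfolding network replaces the fixed prior step with a learned network at each iteration; MiJUN additionally accelerates this iteration via a trapezoidal (second-order) discretization.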
📝 Abstract
In the coded aperture snapshot spectral imaging system, Deep Unfolding Networks (DUNs) have made impressive progress in recovering 3D hyperspectral images (HSIs) from a single 2D measurement. However, the inherent nonlinear and ill-posed characteristics of HSI reconstruction still pose challenges to existing methods in terms of accuracy and stability. To address this issue, we propose a Mamba-inspired Joint Unfolding Network (MiJUN), which integrates physics-embedded DUNs with learning-based HSI imaging. Firstly, leveraging the concept of trapezoidal discretization to expand the representation space of unfolding networks, we introduce an accelerated unfolding network scheme. This approach can be interpreted as a generalized accelerated half-quadratic splitting with a second-order differential equation, which reduces the reliance on initial optimization stages and addresses challenges related to long-range interactions. Crucially, within the Mamba framework, we restructure the Mamba-inspired global-to-local attention mechanism by incorporating a selective state space model and an attention mechanism. This effectively reinterprets Mamba as a variant of the Transformer architecture, improving its adaptability and efficiency. Furthermore, we refine the scanning strategy of Mamba by integrating tensor mode-$k$ unfolding into the Mamba network. This approach emphasizes the low-rank properties of tensors along various modes, while conveniently facilitating 12 scanning directions. Numerical and visual comparisons on both simulation and real datasets demonstrate the superiority of our proposed MiJUN, which achieves superior detail representation.
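For intuition, tensor mode-$k$ unfolding (matricization) rearranges a 3D cube into a matrix whose rows index one mode, so low-rank structure along that mode becomes visible and 1D scans can follow it. The helper below is a generic NumPy illustration of the operation, not the paper's code:

```python
import numpy as np

def mode_k_unfold(tensor: np.ndarray, k: int) -> np.ndarray:
    """Mode-k unfolding: move axis k to the front, then flatten the
    remaining axes, giving an (I_k x prod(other dims)) matrix."""
    return np.moveaxis(tensor, k, 0).reshape(tensor.shape[k], -1)

# A toy HSI cube: height x width x spectral bands.
cube = np.arange(2 * 3 * 4).reshape(2, 3, 4)

# Each mode unfolding exposes low-rank structure along a different mode.
print(mode_k_unfold(cube, 0).shape)  # (2, 12)
print(mode_k_unfold(cube, 1).shape)  # (3, 8)
print(mode_k_unfold(cube, 2).shape)  # (4, 6)
```

Scanning each of the three unfoldings forward and backward, along rows and along columns, is one natural way to arrive at the 12 scanning directions mentioned above, though the paper's exact ordering may differ.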