🤖 AI Summary
Existing underwater video enhancement methods mostly apply single-image models frame by frame, which leads to temporal inconsistency and flicker. This paper introduces WaterWave, an implicit representation of enhanced video that is built on a wavelet-based temporal consistency field and requires no paired data. Methodologically: (1) it derives a temporal consistency prior for dynamic scenes from a local temporal-frequency perspective; (2) under that prior, it progressively filters and attenuates temporally inconsistent components while preserving motion details and scene content; and (3) it adds an underwater flow correction module that rectifies estimated flows by accounting for transmission in underwater scenes. Experiments show that WaterWave improves the quality of videos produced by single-image underwater enhancement methods and, in downstream tracking with UOSTrack and MAT, improves precision over the original video by 19.7% and 9.7%, respectively.
📝 Abstract
Paired underwater videos are difficult to obtain because of the complexity of underwater imaging. As a result, most existing underwater video enhancement methods apply a single-image enhancement model frame by frame, which naturally lacks temporal consistency. To alleviate this problem, we rethink the temporal manifold inherent in natural videos and observe a temporal consistency prior in dynamic scenes from the local temporal frequency perspective. Building on this prior and the absence of paired data, we propose WaterWave, an implicit representation of enhanced video signals that operates in a wavelet-based temporal consistency field. Specifically, under the constraints of the prior, we progressively filter and attenuate the inconsistent components while preserving motion details and scenes, yielding a naturally flowing video. Furthermore, to represent temporal frequency bands more accurately, we design an underwater flow correction module that rectifies estimated flows by accounting for transmission in underwater scenes. Extensive experiments demonstrate that WaterWave significantly enhances the quality of videos generated by single-image underwater enhancement methods. Additionally, our method shows high potential in downstream underwater tracking tasks, such as UOSTrack and MAT, outperforming the original video by a large margin, i.e., 19.7% and 9.7% in precision, respectively.
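To make the idea of attenuating temporally inconsistent components concrete, the sketch below applies a one-level Haar wavelet transform to the intensity of a single pixel over time and shrinks the high-frequency band, which is where frame-to-frame flicker shows up. This is only an illustration and not the WaterWave method: the function names, the choice of the Haar wavelet, and the fixed `keep` gain are all assumptions, and the abstract's learned implicit representation, progressive filtering, and flow correction are not modeled.

```python
import math

SQRT2 = math.sqrt(2.0)

def haar_split(trace):
    """One-level Haar transform of a pixel's intensity over time.

    Returns (low, high): pairwise sums and differences of consecutive
    frames, each scaled by 1/sqrt(2) so the transform is orthonormal.
    """
    if len(trace) % 2 != 0:
        raise ValueError("expected an even number of samples")
    pairs = list(zip(trace[0::2], trace[1::2]))
    low = [(a + b) / SQRT2 for a, b in pairs]
    high = [(a - b) / SQRT2 for a, b in pairs]
    return low, high

def haar_merge(low, high):
    """Inverse of haar_split: rebuild the time series from both bands."""
    trace = []
    for l, h in zip(low, high):
        trace.append((l + h) / SQRT2)
        trace.append((l - h) / SQRT2)
    return trace

def attenuate_flicker(trace, keep=0.5):
    """Scale the high temporal band by `keep` (hypothetical fixed gain).

    keep=1.0 reproduces the input; keep=0.0 replaces each pair of
    consecutive frames with their mean.
    """
    low, high = haar_split(trace)
    return haar_merge(low, [keep * h for h in high])

# Intensity of one pixel across four frames enhanced independently.
flickering = [0.2, 0.8, 0.3, 0.7]
smoothed = attenuate_flicker(flickering, keep=0.5)
```

A uniform gain like this also damps genuine motion, since real movement contributes to the high temporal band as well. That limitation is why the abstract pairs the filtering with a consistency prior and corrected flow estimates rather than attenuating every difference equally.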