Memory-Efficient Optical Flow via Radius-Distribution Orthogonal Cost Volume

📅 2023-12-06
🏛️ arXiv.org
📈 Citations: 5
Influential: 0
🤖 AI Summary
To address the memory explosion problem in high-resolution optical flow estimation, caused by the quadratic memory complexity (O(H²W²)) of conventional all-pairs 4D cost volumes, the paper introduces MeFlow, a recurrent local orthogonal cost volume representation. The method dynamically decomposes the 2D search space into two orthogonal 1D subspaces, reducing memory complexity to O(HW). Technically, it proposes a radius-distribution multi-scale lookup strategy to capture large displacements at negligible cost, leverages self-attention to propagate 2D feature information into the orthogonal subspaces, and constructs the local cost volume recurrently during iterative refinement. Evaluated on the Sintel and KITTI benchmarks, the approach achieves competitive accuracy. On 4K images (2160×3840), it substantially reduces GPU memory consumption compared to RAFT and Transformer-based baselines while maintaining high precision. The method thus bridges the gap between accuracy and memory efficiency for large-scale optical flow estimation.
📝 Abstract
The full 4D cost volume in Recurrent All-Pairs Field Transforms (RAFT) or global matching by Transformer achieves impressive performance for optical flow estimation. However, their memory consumption increases quadratically with input resolution, rendering them impractical for high-resolution images. In this paper, we present MeFlow, a novel memory-efficient method for high-resolution optical flow estimation. The key to MeFlow is a recurrent local orthogonal cost volume representation, which decomposes the 2D search space dynamically into two 1D orthogonal spaces, enabling our method to scale effectively to very high-resolution inputs. To preserve essential information in the orthogonal space, we utilize self-attention to propagate feature information from the 2D space to the orthogonal space. We further propose a radius-distribution multi-scale lookup strategy to model the correspondences of large displacements at a negligible cost. We verify the efficiency and effectiveness of our method on the challenging Sintel and KITTI benchmarks, and real-world 4K ($2160\!\times\!3840$) images. Our method achieves competitive performance on both Sintel and KITTI benchmarks, while maintaining the highest memory efficiency on high-resolution inputs.
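The memory argument above can be illustrated numerically. The sketch below (not the authors' code; shapes, names, and the NumPy formulation are assumptions) compares the entry count of a full all-pairs 4D cost volume with that of two orthogonal 1D cost volumes, one along each pixel's row and one along its column. The paper further restricts the lookup to local windows built recurrently, which brings the footprint down to O(HW); this sketch only shows the orthogonal decomposition itself.

```python
import numpy as np

# Illustrative feature maps of shape (H, W, C); values are random.
H, W, C = 64, 128, 32
rng = np.random.default_rng(0)
f1 = rng.standard_normal((H, W, C)).astype(np.float32)
f2 = rng.standard_normal((H, W, C)).astype(np.float32)

# Full 4D cost volume: every pixel vs. every pixel -> O(H^2 W^2) entries.
full_entries = (H * W) ** 2

# Orthogonal decomposition: each pixel is correlated only with candidates
# on its own row (W of them) and its own column (H of them).
horiz = np.einsum('hwc,hvc->hwv', f1, f2)  # (H, W, W): row-wise matches
vert = np.einsum('hwc,uwc->hwu', f1, f2)   # (H, W, H): column-wise matches
ortho_entries = horiz.size + vert.size      # O(HW(H + W))

print(full_entries, ortho_entries)  # -> 67108864 1572864
```

Even without the local-window restriction, the decomposition cuts the cost volume from 67M to 1.6M entries at this resolution, and the gap widens quadratically with input size.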
Problem

Research questions and friction points this paper is trying to address.

How to curb the quadratic memory growth of 4D cost volumes in high-resolution optical flow estimation.
How to decompose the 2D search space into 1D orthogonal spaces without sacrificing scalability.
How to preserve essential 2D feature information in the orthogonal spaces (addressed via self-attention).
Innovation

Methods, ideas, or system contributions that make the work stand out.

Recurrent local orthogonal cost volume representation
Self-attention for feature information propagation
Radius-distribution multi-scale lookup strategy
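The radius-distribution lookup idea can be sketched in isolation: sample 1D search offsets densely near zero displacement and increasingly coarsely at larger radii, so a wide displacement range is covered with few samples. The radii and strides below are illustrative assumptions, not values from the paper.

```python
def multiscale_offsets(radii=(4, 16, 64), strides=(1, 4, 16)):
    """Return sorted unique 1D lookup offsets covering +/- max(radii).

    Each (radius, stride) pair adds symmetric offsets from the previous
    radius out to `radius`, spaced by `stride`: dense near 0, sparse far out.
    """
    offsets = {0}
    prev = 0
    for r, s in zip(radii, strides):
        for d in range(prev, r + 1, s):
            offsets.add(d)
            offsets.add(-d)
        prev = r
    return sorted(offsets)

offs = multiscale_offsets()
print(len(offs), min(offs), max(offs))  # -> 21 -64 64
```

With these example values, a ±64-pixel search range along one orthogonal line costs only 21 correlation lookups instead of 129, which is what lets large displacements be modeled at negligible cost.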