🤖 AI Summary
Problem: Accurate scene flow estimation at long range (>100 m) remains challenging in autonomous driving, because dense-grid methods become computationally intractable at that range and existing approaches offer insufficient feature representation and motion modeling for sparse, distant LiDAR points.
Method: We propose the first end-to-end scene flow framework built upon sparse convolution. It introduces a novel sparse feature fusion mechanism and a virtual voxel alignment strategy, coupled with a range-aware metric and a range-weighted loss function to explicitly enhance robustness for long-range motion estimation.
Contribution/Results: Evaluated on Argoverse2, our method achieves state-of-the-art performance for scene flow estimation beyond 100 meters, significantly improving accuracy in far-field regions. It establishes a scalable, high-precision paradigm for long-range 3D dynamic scene understanding, overcoming key limitations of prior dense and sparse approaches.
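The sparse feature fusion idea can be illustrated with a minimal NumPy sketch. It is not the released SSF implementation: the function name `fuse_sparse_features`, the zero-filled virtual voxels, and the channel-wise concatenation are assumptions made for illustration, and each scan is assumed to contain no duplicate voxel coordinates. The sketch aligns two sparse voxel feature maps on the union of their occupied coordinates so that both share the same size and ordering.

```python
import numpy as np

def fuse_sparse_features(coords_a, feats_a, coords_b, feats_b):
    """Align two sparse voxel feature maps on the union of their coordinates.

    A voxel occupied in only one scan receives a zero-feature "virtual voxel"
    in the other scan (zero-filling is an assumption of this sketch).
    """
    # Union of occupied voxel coordinates, sorted lexicographically.
    # `inverse` maps every input row to its row index in the union.
    all_coords = np.concatenate([coords_a, coords_b], axis=0)
    union, inverse = np.unique(all_coords, axis=0, return_inverse=True)
    inverse = inverse.reshape(-1)  # flatten for consistency across NumPy versions
    idx_a, idx_b = inverse[: len(coords_a)], inverse[len(coords_a):]

    # Scatter each scan's features into a buffer ordered like the union;
    # rows that stay zero are the virtual voxels.
    aligned_a = np.zeros((len(union), feats_a.shape[1]), dtype=feats_a.dtype)
    aligned_b = np.zeros((len(union), feats_b.shape[1]), dtype=feats_b.dtype)
    aligned_a[idx_a] = feats_a
    aligned_b[idx_b] = feats_b

    # Both maps now share size and ordering, so they can be concatenated per voxel.
    fused = np.concatenate([aligned_a, aligned_b], axis=1)
    return union, fused

# Two tiny scans that overlap in a single voxel, (1, 0, 0).
coords_a = np.array([[0, 0, 0], [1, 0, 0]])
feats_a = np.array([[1.0], [2.0]])
coords_b = np.array([[1, 0, 0], [2, 0, 0]])
feats_b = np.array([[10.0], [20.0]])
union, fused = fuse_sparse_features(coords_a, feats_a, coords_b, feats_b)
```

In this toy case the union holds three voxels, and the first and last rows of `fused` each contain one zero entry where a virtual voxel was added.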
📝 Abstract
Scene flow enables an understanding of the motion characteristics of the environment in the 3D world. It is particularly significant at long range, where object-based perception methods might fail due to sparse observations far away. Although significant advancements have been made in scene flow pipelines to handle large-scale point clouds, a gap remains in scalability with respect to range. We attribute this limitation to the common design choice of using dense feature grids, which scale quadratically with range. In this paper, we propose Sparse Scene Flow (SSF), a general pipeline for long-range scene flow that adopts a sparse-convolution-based backbone for feature extraction. This approach introduces a new challenge: a mismatch in the size and ordering of sparse feature maps between time-sequential point scans. To address this, we propose a sparse feature fusion scheme that augments the feature maps with virtual voxels at missing locations. Additionally, we propose a range-wise metric that implicitly gives greater importance to faraway points. Our method, SSF, achieves state-of-the-art results on the Argoverse2 dataset, demonstrating strong performance in long-range scene flow estimation. Our code will be released at https://github.com/KTH-RPL/SSF.git.
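One way a range-wise metric can implicitly favor faraway points is sketched below. This is an illustration under stated assumptions, not the metric definition from the paper: the function name `range_wise_epe`, the bin edges, and the use of planar distance from the sensor are all hypothetical. The sketch computes the mean end-point error inside each distance bin and then averages over bins, so a sparsely populated far bin counts as much as a densely populated near bin.

```python
import numpy as np

def range_wise_epe(points, flow_pred, flow_gt, bin_edges=(0.0, 50.0, 100.0, np.inf)):
    """Average of per-range-bin mean end-point errors (hypothetical bin edges)."""
    dist = np.linalg.norm(points[:, :2], axis=1)        # planar distance to the sensor
    epe = np.linalg.norm(flow_pred - flow_gt, axis=1)   # per-point end-point error
    per_bin = []
    for lo, hi in zip(bin_edges[:-1], bin_edges[1:]):
        mask = (dist >= lo) & (dist < hi)
        if mask.any():                                  # skip empty bins
            per_bin.append(epe[mask].mean())
    return float(np.mean(per_bin)) if per_bin else float("nan")

# Three near points predicted perfectly and one far point that is off by 1 m.
points = np.array([[10.0, 0.0, 0.0], [10.0, 1.0, 0.0], [10.0, 2.0, 0.0], [120.0, 0.0, 0.0]])
flow_gt = np.zeros((4, 3))
flow_pred = np.zeros((4, 3))
flow_pred[3, 0] = 1.0

plain_epe = float(np.linalg.norm(flow_pred - flow_gt, axis=1).mean())
binned_epe = range_wise_epe(points, flow_pred, flow_gt)
```

Here the plain per-point average is 0.25 m, while the range-wise average is 0.5 m, because the single far point fills an entire bin on its own.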