AI Summary
This work proposes a deep learning-based robust monocular visual feature tracking method to address the significant performance degradation that traditional visual-inertial odometry (VIO) systems suffer in challenging environments, such as those with low texture or drastic illumination changes, where handcrafted features often fail to track reliably. The proposed approach replaces the conventional hand-engineered features in the VINS-Fusion framework with data-driven features designed to remain trackable under adverse conditions. By leveraging learned representations, the method substantially improves the accuracy and robustness of state estimation in extreme scenarios. Experimental results show that the proposed system consistently outperforms existing VIO approaches under identical conditions, reducing the reliance of traditional methods on stable environmental texture and lighting.
Abstract
SLAM (Simultaneous Localization and Mapping) and odometry are core techniques for estimating the pose of mobile platforms, such as robots and cars, using one or more sensors. In camera-based SLAM or odometry in particular, reliable tracking of visual features is critical, since tracking quality directly determines system performance. In this paper, we propose a method that leverages deep learning to robustly track visual features in monocular camera images. The method operates reliably even in textureless environments and under rapid lighting changes. We evaluate its performance by integrating it into VINS-Fusion (monocular-inertial), a widely used Visual-Inertial Odometry (VIO) system.
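The abstract does not detail how learned features are associated between frames. As a rough illustration of the general idea, the sketch below shows mutual-nearest-neighbour matching of descriptor vectors, a common way to track learned features across images; the function name and all dimensions are hypothetical and are not taken from the paper.

```python
import numpy as np

def mutual_nearest_matches(desc_a, desc_b):
    """Match L2-normalized descriptors by mutual nearest neighbour.

    desc_a: (N, D) and desc_b: (M, D) arrays of unit-norm descriptors,
    e.g. from a learned feature extractor (hypothetical interface).
    Returns an (K, 2) array of index pairs (i in A, j in B).
    """
    sim = desc_a @ desc_b.T            # cosine similarity matrix (N, M)
    nn_ab = sim.argmax(axis=1)         # best match in B for each row of A
    nn_ba = sim.argmax(axis=0)         # best match in A for each row of B
    # keep only pairs that agree in both directions
    matches = [(i, j) for i, j in enumerate(nn_ab) if nn_ba[j] == i]
    return np.array(matches)

# toy check: identical descriptor sets should match one-to-one
rng = np.random.default_rng(0)
d = rng.normal(size=(5, 32))
d /= np.linalg.norm(d, axis=1, keepdims=True)
matches = mutual_nearest_matches(d, d)
print(matches)
```

The mutual check is a simple outlier filter; real VIO front ends typically add geometric verification (e.g. RANSAC on the essential matrix) on top of descriptor matching.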