StixelNExT: Toward Monocular Low-Weight Perception for Object Segmentation and Free Space Detection

📅 2024-06-02
🏛️ 2024 IEEE Intelligent Vehicles Symposium (IV)
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper addresses generic object segmentation and free-space detection from monocular images without manually labeled training data. Methodologically, it introduces a self-supervised Stixel-World representation-learning framework driven by LiDAR distillation: (i) an improved LiDAR-guided method for generating Stixel ground truth, used to distill knowledge into the image network; (ii) an architecture that directly predicts a multi-layer 2D Stixel-World, enabling localization of multiple overlapping objects; and (iii) a lightweight CNN with implicit monocular depth modeling, so inference requires no LiDAR. The claimed contributions are: (1) a mid-level Stixel representation jointly modeling free space and object segmentation; (2) rapid scene adaptation using only a small set of unlabeled images; and (3) strong accuracy on benchmarks such as KITTI with a significantly reduced parameter count and real-time inference speed.
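The multi-layer Stixel-World output described above can be pictured as a per-column list of vertical "sticks", with several sticks per column encoding overlapping objects. The following Python sketch is only an illustration of that data layout; the class name, fields, and decoder function are assumptions, not the paper's actual interface:

```python
from dataclasses import dataclass

@dataclass
class Stixel:
    """One vertical stick: an image-column span with an estimated depth."""
    u: int         # column index in the image
    v_top: int     # top row of the stick (pixels)
    v_bottom: int  # bottom row of the stick (pixels)
    depth: float   # estimated distance (metres)

def decode_stixel_world(columns):
    """Flatten a per-column, multi-layer prediction into a stixel list.

    `columns` maps column index -> list of (v_top, v_bottom, depth)
    tuples; several entries per column represent overlapping objects.
    """
    world = []
    for u, layers in columns.items():
        for v_top, v_bottom, depth in layers:
            world.append(Stixel(u, v_top, v_bottom, depth))
    return world

# Two overlapping objects in column 64: a near car and a far wall.
pred = {64: [(180, 240, 8.5), (90, 180, 25.0)]}
stixels = decode_stixel_world(pred)
```

The multi-layer aspect is what distinguishes this from the classic single-layer Stixel-World: a column may carry more than one stick, so objects partially occluding each other remain individually localizable.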

📝 Abstract
In this work, we present a novel approach for general object segmentation from a monocular image, eliminating the need for manually labeled training data and enabling rapid, straightforward training and adaptation with minimal data. Our model initially learns from LiDAR during the training process, which is subsequently removed from the system, allowing it to function solely on monocular imagery. This study leverages the concept of the Stixel-World to recognize a medium-level representation of its surroundings. Our network directly predicts a 2D multi-layer Stixel-World and is capable of recognizing and locating multiple, superimposed objects within an image. Due to the scarcity of comparable works, we have divided the capabilities into modules and present a free-space detection in our experiments section. Furthermore, we introduce an improved method for generating Stixels from LiDAR data, which we use as ground truth for our network.
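To make the ground-truth idea concrete: projected LiDAR points can be binned into image columns and split into depth clusters, each cluster's vertical extent becoming one stixel. The paper's actual generation method differs in detail; the sketch below is an assumption-laden toy (column width, gap threshold, and the mean-depth summary are all illustrative choices), and the camera projection step is omitted:

```python
import numpy as np

def lidar_points_to_stixels(uv_depth, img_width, col_width=8, gap=1.5):
    """Toy LiDAR-to-Stixel ground-truth sketch.

    uv_depth: (N, 3) array of (u, v, depth) for LiDAR points already
    projected into the image. Each image column of `col_width` pixels
    is clustered by depth gaps; each cluster yields one stixel
    (u, v_top, v_bottom, mean_depth).
    """
    stixels = []
    for c0 in range(0, img_width, col_width):
        mask = (uv_depth[:, 0] >= c0) & (uv_depth[:, 0] < c0 + col_width)
        in_col = uv_depth[mask]
        if len(in_col) == 0:
            continue
        by_depth = in_col[np.argsort(in_col[:, 2])]
        cluster = [by_depth[0]]
        for p in by_depth[1:]:
            if p[2] - cluster[-1][2] > gap:   # depth jump -> new object
                stixels.append(_to_stixel(c0, np.array(cluster)))
                cluster = [p]
            else:
                cluster.append(p)
        stixels.append(_to_stixel(c0, np.array(cluster)))
    return stixels

def _to_stixel(u, pts):
    """Summarize a point cluster as (u, v_top, v_bottom, mean_depth)."""
    return (u, float(pts[:, 1].min()), float(pts[:, 1].max()),
            float(pts[:, 2].mean()))

# Four points in one column: a near object (~5 m) and a far one (~20 m).
pts = np.array([[2.0, 100.0, 5.0],
                [3.0, 150.0, 5.2],
                [4.0,  80.0, 20.0],
                [5.0,  60.0, 20.3]])
gt = lidar_points_to_stixels(pts, img_width=8)
```

During training, such LiDAR-derived stixels supervise the image network; at inference the LiDAR branch is dropped entirely, leaving a monocular-only model.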
Problem

Research questions and friction points this paper is trying to address.

Object segmentation from monocular images without manual labels
Stixel-World-based 2D multi-layer object recognition
Improved LiDAR-to-Stixel ground-truth generation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Label-free training pipeline for monocular object segmentation
LiDAR-trained model that runs on monocular imagery alone at inference
Improved method for generating Stixel ground truth from LiDAR data
Marcel Vosshans
Institute for Intelligent Systems, Faculty of Computer Science and Engineering, University of Applied Sciences Esslingen, Germany
Omar Ait-Aider
Institut Pascal ISPR (Image, Systems of Perception, Robotics), Université Clermont Auvergne INP / CNRS, France
Y. Mezouar
Institut Pascal ISPR (Image, Systems of Perception, Robotics), Université Clermont Auvergne INP / CNRS, France
Markus Enzweiler
Professor of Computer Science, Esslingen University of Applied Sciences
Autonomous Systems · Scene Understanding · Deep Learning · Self-Driving