🤖 AI Summary
This work addresses the problem of high-precision estimation of vehicle kinematic states—including ego-vehicle speed, yaw angle, and relative distance/speed to the leading vehicle—from dashcam video. We propose a lightweight neural network that fuses time-synchronized CAN-bus measurements of vehicle dynamics with consumer-grade dashcam video in an end-to-end prediction framework. To ensure reproducibility, we design an open-source, multi-sensor data acquisition and annotation pipeline with rigorous temporal alignment. Evaluated on 18 hours of real-world driving data, our model significantly outperforms vision-only baselines: mean absolute errors in speed and relative-distance estimation are below 0.5 m/s and 0.8 m, respectively. The approach offers a reliable, deployable solution for low-cost perception enhancement in autonomous driving systems and for high-fidelity simulation scenario generation.
📝 Abstract
The goal of this paper is to explore how accurately dashcam footage can be used to predict the actual kinematic motion of a car-like vehicle. Our approach uses ground-truth information from the vehicle's on-board data stream, obtained through the controller area network, together with a time-synchronized dashboard camera mounted in a consumer-grade vehicle, covering 18 hours of driving footage. The contributions of the paper include neural network models that quantify the accuracy of predicting the vehicle's speed and yaw, as well as the presence of a lead vehicle and its relative distance and speed. In addition, the paper describes how other researchers can gather their own data to perform similar experiments using open-source tools and off-the-shelf technology.
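Both the summary and abstract emphasize temporal alignment between the CAN-bus stream and the camera. As a rough illustration of what such an alignment step involves (the paper's actual pipeline is not reproduced here, and all function and variable names below are hypothetical), one common approach is to linearly interpolate each CAN signal at the timestamp of every video frame:

```python
import bisect

def align_can_to_frames(frame_times, can_samples):
    """Interpolate a CAN signal at each video frame timestamp.

    frame_times: sorted frame timestamps in seconds
    can_samples: sorted (timestamp, value) pairs from the CAN bus
    Returns one interpolated value per frame, clamped at the ends.
    """
    times = [t for t, _ in can_samples]
    values = [v for _, v in can_samples]
    aligned = []
    for ft in frame_times:
        i = bisect.bisect_left(times, ft)
        if i == 0:
            aligned.append(values[0])    # frame precedes first CAN sample
        elif i == len(times):
            aligned.append(values[-1])   # frame follows last CAN sample
        else:
            t0, t1 = times[i - 1], times[i]
            w = (ft - t0) / (t1 - t0)    # linear interpolation weight
            aligned.append(values[i - 1] * (1 - w) + values[i] * w)
    return aligned
```

Since CAN messages (often 50–100 Hz) arrive faster than typical dashcam frames (~30 Hz), interpolating the denser signal onto the sparser frame clock is a natural choice; the clocks of the two devices would still need a shared reference, which is exactly the kind of synchronization the pipeline described above must provide.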