AI Summary
To address the joint requirements of accuracy, robustness, and real-time performance for autonomous driving motion prediction in complex traffic scenarios, this paper proposes the first lightweight motion prediction framework integrating large language models (LLMs) with chain-of-thought (CoT) prompting. Methodologically, it introduces: (1) a novel fine-tuning-free CoT semantic annotation generation paradigm that automatically produces high-quality traffic semantic labels; (2) knowledge distillation to transfer LLM-based scene understanding capabilities into an edge-deployable lightweight language model; and (3) Highway-Text and Urban-Text, the first publicly available text datasets tailored for traffic scene description. Evaluated on five real-world benchmarks, our approach surpasses state-of-the-art methods, reducing prediction error by 12.6%–23.4% while maintaining inference latency under 80 ms, which meets stringent real-time constraints for onboard edge devices.
Abstract
Accurate motion forecasting is crucial for safe autonomous driving (AD). This study proposes CoT-Drive, a novel approach that enhances motion forecasting by leveraging large language models (LLMs) and a chain-of-thought (CoT) prompting method. We introduce a teacher-student knowledge distillation strategy to effectively transfer LLMs' advanced scene understanding capabilities to lightweight language models (LMs), ensuring that CoT-Drive operates in real-time on edge devices while maintaining comprehensive scene understanding and generalization capabilities. By leveraging CoT prompting techniques for LLMs without additional training, CoT-Drive generates semantic annotations that significantly improve the understanding of complex traffic environments, thereby boosting the accuracy and robustness of predictions. Additionally, we present two new scene description datasets, Highway-Text and Urban-Text, designed for fine-tuning lightweight LMs to generate context-specific semantic annotations. Comprehensive evaluations on five real-world datasets demonstrate that CoT-Drive outperforms existing models, highlighting its effectiveness and efficiency in handling complex traffic scenarios. Overall, this study is the first to consider the practical application of LLMs in this field. It pioneers the training and use of a lightweight LLM surrogate for motion forecasting, setting a new benchmark and showcasing the potential of integrating LLMs into AD systems.
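The abstract describes two steps that a small sketch may make more concrete: prompting a teacher LLM to reason step by step about a traffic scene, and pairing its annotation with the scene prompt as a fine-tuning example for the lightweight student LM. The Python below is only an illustration under stated assumptions. The `Agent` fields, the wording of `COT_STEPS`, and the helpers `build_cot_prompt` and `make_distillation_pair` are all hypothetical and are not taken from CoT-Drive; the paper's actual prompts and data format may differ.

```python
from __future__ import annotations

from dataclasses import dataclass

# Hypothetical reasoning steps; the paper's actual CoT prompt is not shown here.
COT_STEPS = (
    "Describe the road layout and traffic conditions.",
    "Identify the agents that interact with the target vehicle.",
    "Infer the likely intention of each interacting agent.",
    "Summarize the scene in one sentence for the motion predictor.",
)


@dataclass
class Agent:
    """Minimal, assumed description of one traffic participant."""
    agent_id: str
    kind: str        # e.g. "car" or "truck"
    lane: int
    speed_mps: float


def describe(role: str, agent: Agent) -> str:
    """Render one agent as a single line of plain text."""
    return (f"{role}: {agent.kind} {agent.agent_id} in lane {agent.lane} "
            f"at {agent.speed_mps:.1f} m/s.")


def build_cot_prompt(target: Agent, neighbors: list[Agent]) -> str:
    """Serialize a scene into a step-by-step (chain-of-thought) prompt."""
    lines = [describe("Target", target)]
    lines.extend(describe("Neighbor", n) for n in neighbors)
    lines.append("Reason step by step:")
    lines.extend(f"{i}. {step}" for i, step in enumerate(COT_STEPS, start=1))
    return "\n".join(lines)


def make_distillation_pair(prompt: str, teacher_annotation: str) -> dict[str, str]:
    """Pair a prompt with the teacher's annotation as one student training example."""
    return {"input": prompt, "target": teacher_annotation.strip()}
```

In this sketch the teacher's text output serves directly as the student's training target, which is one common way to distill a large model into a small one. Whether CoT-Drive uses exactly this form of distillation is not stated in the abstract.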