Chameleon: Fast-slow Neuro-symbolic Lane Topology Extraction

📅 2025-03-10
📈 Citations: 0
Influential: 0
🤖 AI Summary
To reduce the computational cost of lane topology inference in mapless (HD-map-free) autonomous driving, this paper proposes a fast-slow dual-system architecture. The fast system runs synthesized programs that model instance-level relations among detected lanes and traffic elements in real time, enabling efficient topological reasoning (e.g., whether a left turn into a given lane is feasible). The slow system is triggered only on corner cases, invoking a vision-language model (VLM) augmented with chain-of-thought prompting and neuro-symbolic reasoning for fine-grained verification. By combining VLMs, instance-level relation modeling, and interpretable reasoning, the method improves over multiple detector-based baselines on OpenLane-V2 (+3.2% mAP) while reducing VLM invocation frequency by 70%, balancing accuracy against computational overhead. Code and data are publicly released.

📝 Abstract
Lane topology extraction involves detecting lanes and traffic elements and determining their relationships, a key perception task for mapless autonomous driving. This task requires complex reasoning, such as determining whether it is possible to turn left into a specific lane. To address this challenge, we introduce neuro-symbolic methods powered by vision-language foundation models (VLMs). Existing approaches have notable limitations: (1) Dense visual prompting with VLMs can achieve strong performance but is costly in terms of both financial resources and carbon footprint, making it impractical for robotics applications. (2) Neuro-symbolic reasoning methods for 3D scene understanding fail to integrate visual inputs when synthesizing programs, making them ineffective in handling complex corner cases. To this end, we propose a fast-slow neuro-symbolic lane topology extraction algorithm, named Chameleon, which alternates between a fast system that directly reasons over detected instances using synthesized programs and a slow system that utilizes a VLM with a chain-of-thought design to handle corner cases. Chameleon leverages the strengths of both approaches, providing an affordable solution while maintaining high performance. We evaluate the method on the OpenLane-V2 dataset, showing consistent improvements across various baseline detectors. Our code, data, and models are publicly available at https://github.com/XR-Lee/neural-symbolic
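The fast-slow alternation described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the released code: the `Lane` type, the `fast_connected` rule (a stand-in for a synthesized program), the `near` and `far` distance thresholds, and the `slow_system` callable that stands in for a VLM query.

```python
from dataclasses import dataclass
from math import dist
from typing import Callable, Optional, Sequence, Tuple

Point = Tuple[float, float]


@dataclass
class Lane:
    lane_id: str
    centerline: Sequence[Point]  # ordered from lane start to lane end


def fast_connected(a: Lane, b: Lane, near: float = 0.5, far: float = 2.0) -> Optional[bool]:
    """Hypothetical stand-in for a synthesized program: does lane a flow into lane b?

    Returns True or False when the geometry is clear-cut, and None to flag
    an ambiguous corner case that should be passed to the slow system.
    The thresholds (in metres) are illustrative assumptions.
    """
    gap = dist(a.centerline[-1], b.centerline[0])
    if gap <= near:
        return True
    if gap >= far:
        return False
    return None


class FastSlowReasoner:
    """Answer with the fast rule when possible; fall back to the slow system otherwise."""

    def __init__(self, slow_system: Callable[[Lane, Lane], bool]):
        self.slow_system = slow_system  # assumed wrapper around a VLM call
        self.slow_calls = 0             # how often the expensive path was taken

    def connected(self, a: Lane, b: Lane) -> bool:
        verdict = fast_connected(a, b)
        if verdict is not None:
            return verdict
        self.slow_calls += 1
        return self.slow_system(a, b)
```

Tracking `slow_calls` makes the cost trade-off visible: the fewer pairs fall into the ambiguous band, the fewer VLM queries are issued. In practice the corner-case test would depend on the detector and the relation being predicted; the distance band above is only one simple choice.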
Problem

Research questions and friction points this paper is trying to address.

Extracts lane topology for mapless autonomous driving
Integrates neuro-symbolic reasoning with vision-language models
Balances cost and performance in lane topology extraction
Innovation

Methods, ideas, or system contributions that make the work stand out.

Fast-slow neuro-symbolic lane topology extraction
Vision-language foundation models for complex reasoning
Chain-of-thought design for handling corner cases
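As a rough illustration of what a chain-of-thought design for the slow system might look like, the sketch below assembles a step-by-step prompt and parses a yes/no verdict from the reply. The function names, the prompt wording, and the answer format are all hypothetical; the paper's actual prompts are not reproduced here, and no specific VLM API is assumed.

```python
import re
from typing import Optional, Sequence


def build_cot_prompt(question: str, steps: Sequence[str]) -> str:
    """Assemble a step-by-step prompt that ends with a fixed answer format."""
    numbered = "\n".join(f"{i}. {step}" for i, step in enumerate(steps, start=1))
    return (
        f"Question: {question}\n"
        "Reason through the following steps in order:\n"
        f"{numbered}\n"
        "Finish with a line of the form 'Final answer: yes' or 'Final answer: no'."
    )


def parse_final_answer(reply: str) -> Optional[bool]:
    """Return True/False from the last 'Final answer' line, or None if absent."""
    matches = re.findall(r"final answer:\s*(yes|no)\b", reply, flags=re.IGNORECASE)
    if not matches:
        return None
    return matches[-1].lower() == "yes"
```

Returning `None` when no verdict is found lets the caller decide how to treat an unparseable reply, for example by retrying or by keeping the fast system's best guess.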
Authors

Zongzheng Zhang
Institute for AI Industry Research (AIR), Tsinghua University, China; Bosch Corporate Research, China
Xinrun Li
Bosch Corporate Research, China
Sizhe Zou
Institute for AI Industry Research (AIR), Tsinghua University, China
Guoxuan Chi
Tsinghua University
Mobile Computing, Wireless Sensing, Spatial Intelligence
Siqi Li
Institute for AI Industry Research (AIR), Tsinghua University, China
Xuchong Qiu
Bosch Corporate Research, China
Guoliang Wang
Institute for AI Industry Research (AIR), Tsinghua University, China
Guantian Zheng
Institute for AI Industry Research (AIR), Tsinghua University, China
Leichen Wang
Bosch Corporate Research, China
Hang Zhao
Institute for Interdisciplinary Information Sciences (IIIS), Tsinghua University, China
Hao Zhao
Institute for AI Industry Research (AIR), Tsinghua University, China