Chameleon: Fast-slow Neuro-symbolic Lane Topology Extraction

📅 2025-03-10

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

To address the high computational cost of lane topology inference in autonomous driving without high-definition maps, this paper proposes Chameleon, a fast-slow dual-system architecture. The fast system runs synthesized programs that reason directly over detected instances, modeling relations among lanes and traffic elements in real time (e.g., whether a left turn into a given lane is feasible). The slow system is triggered only on corner cases, where it invokes a vision-language model (VLM) with chain-of-thought prompting for fine-grained verification. By combining VLMs, instance-level relation modeling, and interpretable neuro-symbolic reasoning, the method improves over multiple detector-based baselines on OpenLane-V2 (+3.2% mAP) while reducing VLM invocation frequency by 70%, balancing accuracy against computational overhead. Code, data, and models are publicly released.
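To make the idea of a fast-system program concrete, the following is a minimal sketch of the kind of rule that could be synthesized to decide whether one detected lane feeds into another. It is not taken from the paper: the `Lane` type, the `lanes_connected` function, and both thresholds are hypothetical and chosen only for illustration.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]


@dataclass
class Lane:
    """Hypothetical detected lane: an ordered centerline in metres."""
    lane_id: str
    centerline: List[Point]


def heading(p: Point, q: Point) -> float:
    """Direction of travel from p to q, in radians."""
    return math.atan2(q[1] - p[1], q[0] - p[0])


def angle_diff(a: float, b: float) -> float:
    """Absolute difference between two angles, wrapped into [0, pi]."""
    return abs((a - b + math.pi) % (2 * math.pi) - math.pi)


def lanes_connected(
    a: Lane,
    b: Lane,
    max_gap_m: float = 1.5,                    # assumed threshold
    max_turn_rad: float = math.radians(30.0),  # assumed threshold
) -> bool:
    """Symbolic rule: lane a feeds lane b if a's end is close to b's
    start and the direction of travel does not change abruptly."""
    gap = math.dist(a.centerline[-1], b.centerline[0])
    if gap > max_gap_m:
        return False
    exit_heading = heading(a.centerline[-2], a.centerline[-1])
    entry_heading = heading(b.centerline[0], b.centerline[1])
    return angle_diff(exit_heading, entry_heading) <= max_turn_rad
```

A rule like this costs a few arithmetic operations per lane pair, which is why a fast path built from such programs can run in real time while a VLM call cannot.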

📝 Abstract

Lane topology extraction involves detecting lanes and traffic elements and determining their relationships, a key perception task for mapless autonomous driving. This task requires complex reasoning, such as determining whether it is possible to turn left into a specific lane. To address this challenge, we introduce neuro-symbolic methods powered by vision-language foundation models (VLMs). Existing approaches have notable limitations: (1) Dense visual prompting with VLMs can achieve strong performance but is costly in terms of both financial resources and carbon footprint, making it impractical for robotics applications. (2) Neuro-symbolic reasoning methods for 3D scene understanding fail to integrate visual inputs when synthesizing programs, making them ineffective in handling complex corner cases. To this end, we propose a fast-slow neuro-symbolic lane topology extraction algorithm, named Chameleon, which alternates between a fast system that directly reasons over detected instances using synthesized programs and a slow system that utilizes a VLM with a chain-of-thought design to handle corner cases. Chameleon leverages the strengths of both approaches, providing an affordable solution while maintaining high performance. We evaluate the method on the OpenLane-V2 dataset, showing consistent improvements across various baseline detectors. Our code, data, and models are publicly available at https://github.com/XR-Lee/neural-symbolic
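The alternation between the two systems can be sketched as a simple dispatcher. This is an illustrative assumption, not the released implementation: the abstract does not say how corner cases are identified, so the margin-based confidence, the `min_confidence` threshold, and the `slow_system` callable (a stand-in for a VLM query) are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class FastResult:
    connected: bool
    confidence: float  # in [0, 1]; distance from the decision boundary


def fast_check(gap_m: float, max_gap_m: float = 1.5) -> FastResult:
    """Hypothetical fast rule: two lanes connect if their endpoint gap
    is small. Confidence grows with the margin from the threshold."""
    connected = gap_m <= max_gap_m
    margin = abs(gap_m - max_gap_m) / max_gap_m
    return FastResult(connected, min(1.0, margin))


def extract_relation(
    gap_m: float,
    slow_system: Callable[[float], bool],
    min_confidence: float = 0.3,  # assumed corner-case threshold
) -> Tuple[bool, str]:
    """Answer with the fast rule when it is confident; otherwise defer
    to the slow system. Returns the decision and which path made it."""
    fast = fast_check(gap_m)
    if fast.confidence >= min_confidence:
        return fast.connected, "fast"
    return slow_system(gap_m), "slow"
```

Under this sketch the expensive model is called only for inputs near the decision boundary, which is the cost-saving behaviour the abstract attributes to the fast-slow design.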

Problem

Research questions and friction points this paper is trying to address.

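Extracting lane topology (lanes, traffic elements, and their relationships) for mapless autonomous driving

Bringing visual input into neuro-symbolic reasoning, which existing program-synthesis methods omit

Balancing VLM inference cost against accuracy in lane topology extraction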

Innovation

Methods, ideas, or system contributions that make the work stand out.

Fast-slow neuro-symbolic lane topology extraction

Vision-language foundation models for complex reasoning

Chain-of-thought design for handling corner cases
