🤖 AI Summary
Existing autonomous driving simulation frameworks struggle to simultaneously achieve high-fidelity vehicle dynamics, photorealistic rendering, context-aware scenario orchestration, and real-time performance. This paper presents a high-fidelity digital twin framework that integrates physics-based modeling, neural rendering, 3D reconstruction, and a large language model (LLM) interface, enabling natural-language-driven, semantic-level scenario generation and online editing. The framework achieves real-to-simulation geometric and visual fidelity with up to 97% structural similarity, sustains simulation rates above 60 Hz, attains up to 95% repeatability for natural-language scenario generation, and demonstrates up to 85% cross-scenario generalizability. By unifying physical accuracy, perceptual realism, linguistic controllability, and computational efficiency, the framework balances fidelity, interactivity, and real-time performance within a single system.
📝 Abstract
Simulation frameworks have been key enablers for the development and validation of autonomous driving systems. However, existing methods struggle to comprehensively address the autonomy-oriented requirements of balancing (i) dynamical fidelity, (ii) photorealistic rendering, (iii) context-relevant scenario orchestration, and (iv) real-time performance. To address these limitations, we present a unified framework for creating and curating high-fidelity digital twins to accelerate advancements in autonomous driving research. Our framework leverages a mix of physics-based and data-driven techniques for developing and simulating digital twins of autonomous vehicles and their operating environments. It is capable of reconstructing real-world scenes and assets (real2sim) with geometric and photorealistic accuracy and infusing them with various physical properties to enable real-time dynamical simulation of the ensuing driving scenarios. It also incorporates a large language model (LLM) interface to flexibly edit the driving scenarios online via natural language prompts. We analyze the presented framework in terms of its fidelity, performance, and serviceability. Results indicate that our framework can reconstruct 3D scenes and assets with up to 97% structural similarity, while maintaining frame rates above 60 Hz. We also demonstrate that it can handle natural language prompts to generate diverse driving scenarios with up to 95% repeatability and 85% generalizability.
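The 97% structural-similarity figure refers to the SSIM metric comparing real imagery against its rendered digital twin. As an illustrative sketch only (not the authors' evaluation pipeline), a single-window SSIM can be computed directly in NumPy; the function name `ssim_global` and its stabilizing constants follow the standard SSIM formulation, though real evaluations typically average SSIM over local sliding windows:

```python
import numpy as np

def ssim_global(x: np.ndarray, y: np.ndarray, data_range: float = 1.0) -> float:
    """Global structural similarity (SSIM) between two same-sized images.

    A single-window variant of the standard SSIM formula; production
    pipelines typically average SSIM over local sliding windows.
    """
    c1 = (0.01 * data_range) ** 2  # stabilizer for the luminance term
    c2 = (0.03 * data_range) ** 2  # stabilizer for the contrast/structure term
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return float(((2 * mx * my + c1) * (2 * cov + c2))
                 / ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2)))

# Identical frames score 1.0; a noise-corrupted render scores lower.
rng = np.random.default_rng(0)
real = rng.random((64, 64))
noisy = np.clip(real + 0.1 * rng.standard_normal((64, 64)), 0.0, 1.0)
print(ssim_global(real, real))   # → 1.0
print(ssim_global(real, noisy))  # < 1.0
```

In this formulation, identical real and rendered frames yield exactly 1.0, and the score decays as rendering artifacts or geometric misalignment perturb local statistics.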