AI Summary
This work addresses the high inference latency of cloud-hosted large language models (LLMs), which exceeds the stepwise control window of a vehicle and so limits their use in real-time autonomous driving. Existing learned world models usually keep future generation and action selection inside one large coupled loop rather than decoupling them. The paper proposes SteinsGateDrive, a latency-decoupled planner-runtime architecture that uses a "worldline" metaphor to structure the generation of counterfactual driving futures. The LLM selects among these futures before the final control instant, and the runtime reuses the selected forecast for low-latency control only while its safety contract remains valid. An alpha/beta/gamma role mechanism (nominal, interaction, and hazard-stress futures) yields a typed strategic forecast. Runtime safety comes from atomic predicate checks rather than from the drift score, which only tunes how often the forecast is refreshed. On a matched-seed normal-highway protocol, GPT-5.4 mini reduces effective lag from +3.07 s at a 1-second horizon to -0.01 s at a 4-second horizon while preserving the measured no-collision safety boundary.
Abstract
Cloud-hosted LLM driver agents provide useful semantic judgments, but their inference latency exceeds stepwise vehicle-control windows. Learned world models predict futures, but they usually keep future generation and action selection inside large coupled loops. We present SteinsGateDrive, a latency-decoupled planner-runtime architecture in which the worldline metaphor from the eponymous story names one plausible consequence of an intervention: the LLM selects counterfactual driving futures before the final control instant, and a runtime reuses the selected forecast only while safety contracts remain valid. The generator builds three world-line roles: alpha nominal ego-conditioned futures, beta interaction counterfactuals around nearby vehicles, and gamma hazard-stress futures such as braking, cut-ins, or blocked corridors. The selected branch becomes a typed StrategicForecast with horizon, validity/abort conditions, fallback, and authority. On a within-subject, matched-seed normal-highway protocol with 10 seeds and 20 steps, GPT-5.4 mini reduces effective lag from +3.07 s at 1-second horizon to -0.01 s at 4-second horizon while preserving the measured no-collision safety boundary. The architecture's safety contribution comes from the atom-predicate runtime check, not from the drift score, which functions as a refresh-frequency knob.
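The runtime behavior described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the field names on `StrategicForecast`, the `runtime_step` function, the string-valued `authority` label, and the dictionary-based vehicle state are all assumptions made for illustration. The sketch shows only the division of labor the abstract states: atom predicates decide whether the cached forecast may still be reused, while the drift score only decides whether to request a fresh plan.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Dict, List, Tuple

# Hypothetical state representation: named scalar observations.
State = Dict[str, float]
# An "atom" is a single boolean predicate over the current state.
Atom = Callable[[State], bool]


class Role(Enum):
    ALPHA = "alpha"  # nominal ego-conditioned future
    BETA = "beta"    # interaction counterfactual around nearby vehicles
    GAMMA = "gamma"  # hazard-stress future (braking, cut-in, blocked corridor)


@dataclass(frozen=True)
class StrategicForecast:
    """Assumed shape of the typed forecast; field names are illustrative."""
    role: Role
    action: str             # action to reuse while the contract holds
    horizon_s: float        # how long the forecast may be reused
    valid_while: List[Atom]  # every atom must hold
    abort_if: List[Atom]     # no atom may hold
    fallback: str           # action taken once the contract is broken
    authority: str          # assumed to be a label naming who issued the plan


def runtime_step(
    forecast: StrategicForecast,
    state: State,
    age_s: float,
    drift: float,
    refresh_threshold: float,
) -> Tuple[str, bool]:
    """Return (action, refresh_requested) for one control step."""
    expired = age_s >= forecast.horizon_s
    valid = all(atom(state) for atom in forecast.valid_while)
    aborted = any(atom(state) for atom in forecast.abort_if)
    if expired or aborted or not valid:
        # Safety decision: made by atom predicates and the horizon only.
        return forecast.fallback, True
    # Drift never overrides the action; it only asks for an earlier refresh.
    return forecast.action, drift > refresh_threshold


example = StrategicForecast(
    role=Role.ALPHA,
    action="keep_lane",
    horizon_s=4.0,
    valid_while=[lambda s: s["gap_m"] > 15.0],
    abort_if=[lambda s: s["lead_decel"] > 3.0],
    fallback="brake",
    authority="llm_planner",
)
print(runtime_step(example, {"gap_m": 30.0, "lead_decel": 0.0}, 1.0, 0.1, 0.5))
```

In this sketch a high drift value leaves the chosen action unchanged and merely sets the refresh flag, which mirrors the abstract's description of drift as a refresh-frequency knob and not a safety mechanism.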