AI Summary
Natural interaction in dynamic, complex urban spatiotemporal environments poses significant challenges for on-demand ride-hailing services.
Method: This paper proposes a conversational assistant framework tailored to ride-hailing scenarios. It introduces a novel spatiotemporally aware order-planning module, a cost-constrained multi-strategy dialogue system, and a continual learning framework integrating external tool invocation, hierarchical LLM configuration, multimodal response generators, and reinforcement-based human preference alignment, enabling joint fine-tuning on real-world and synthetic interaction data.
Contribution/Results: In real-world deployment, the assistant reaches 93% accuracy in order planning and 92% in response generation. In offline experiments against three state-of-the-art agent frameworks, order planning and response generation improve by up to 70.23% and 321.27%, respectively, while end-to-end latency decreases by 0.72× to 5.47×. This work establishes a scalable technical paradigm for spatiotemporally grounded dialogue systems.
Abstract
On-demand ride-hailing services like DiDi, Uber, and Lyft have transformed urban transportation, offering unmatched convenience and flexibility. In this paper, we introduce DiMA, an LLM-powered ride-hailing assistant deployed in DiDi Chuxing. Its goal is to provide seamless ride-hailing services and beyond through a natural and efficient conversational interface under dynamic and complex spatiotemporal urban contexts. To achieve this, we propose a spatiotemporal-aware order planning module that leverages external tools for precise spatiotemporal reasoning and progressive order planning. Additionally, we develop a cost-effective dialogue system that integrates multi-type dialog repliers with cost-aware LLM configurations to handle diverse conversation goals and to trade off response quality against latency. Furthermore, we introduce a continual fine-tuning scheme that utilizes real-world interactions and simulated dialogues to align the assistant's behavior with human-preferred decision-making processes. Since its deployment in the DiDi application, DiMA has demonstrated exceptional performance, achieving 93% accuracy in order planning and 92% in response generation during real-world interactions. Offline experiments further validate DiMA's capabilities, showing improvements of up to 70.23% in order planning and 321.27% in response generation compared to three state-of-the-art agent frameworks, while reducing latency by $0.72\times$ to $5.47\times$. These results establish DiMA as an effective, efficient, and intelligent mobile assistant for ride-hailing services.