DiMA: An LLM-Powered Ride-Hailing Assistant at DiDi

2025-02-12
arXiv.org
Citations: 0
Influential: 0
AI Summary
Natural interaction in dynamic, complex urban spatiotemporal environments poses significant challenges for on-demand ride-hailing services. Method: This paper proposes a conversational assistant framework tailored to ride-hailing scenarios. It introduces a spatiotemporally aware order-planning module, a cost-constrained multi-strategy dialogue system, and a continual learning framework integrating external tool invocation, hierarchical LLM configuration, multi-type response generators, and human preference alignment, with fine-tuning on both real-world and simulated interaction data. Contribution/Results: In online deployment, the system reaches 93% accuracy in order planning and 92% in response generation. In offline experiments against three state-of-the-art agent frameworks, order planning and response generation improve by up to 70.23% and 321.27%, respectively, while latency is reduced by 0.72× to 5.47×. This work establishes a scalable technical paradigm for spatiotemporally grounded dialogue systems.

๐Ÿ“ Abstract
On-demand ride-hailing services like DiDi, Uber, and Lyft have transformed urban transportation, offering unmatched convenience and flexibility. In this paper, we introduce DiMA, an LLM-powered ride-hailing assistant deployed in DiDi Chuxing. Its goal is to provide seamless ride-hailing services and beyond through a natural and efficient conversational interface under dynamic and complex spatiotemporal urban contexts. To achieve this, we propose a spatiotemporal-aware order planning module that leverages external tools for precise spatiotemporal reasoning and progressive order planning. Additionally, we develop a cost-effective dialogue system that integrates multi-type dialog repliers with cost-aware LLM configurations to handle diverse conversation goals and trade off response quality against latency. Furthermore, we introduce a continual fine-tuning scheme that utilizes real-world interactions and simulated dialogues to align the assistant's behavior with human-preferred decision-making processes. Since its deployment in the DiDi application, DiMA has demonstrated exceptional performance, achieving 93% accuracy in order planning and 92% in response generation during real-world interactions. Offline experiments further validate DiMA's capabilities, showing improvements of up to 70.23% in order planning and 321.27% in response generation compared to three state-of-the-art agent frameworks, while reducing latency by $0.72\times$ to $5.47\times$. These results establish DiMA as an effective, efficient, and intelligent mobile assistant for ride-hailing services.
Problem

Research questions and friction points this paper is trying to address.

Develops a spatiotemporal-aware order planning module for ride-hailing
Creates a cost-effective dialogue system for diverse conversation goals
Introduces continual fine-tuning to align with human decision-making
Innovation

Methods, ideas, or system contributions that make the work stand out.

Spatiotemporal-aware order planning module
Cost-effective multi-type dialogue system
Continual fine-tuning with real-world data
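
As a rough illustration of the "cost-aware LLM configurations" idea, the sketch below routes each conversation goal to a cheaper or more capable model tier. It is not DiMA's implementation: the tier names, relative costs, goal labels, and the `pick_config` helper are all hypothetical, since the abstract does not specify them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LLMConfig:
    """Hypothetical model tier; the source names no models or costs."""
    name: str
    relative_cost: float  # assumed unit: cost per call relative to the small tier


SMALL = LLMConfig("small-llm", 1.0)
LARGE = LLMConfig("large-llm", 8.0)

# Hypothetical mapping from conversation goal to the tier assumed adequate for it.
ROUTES = {
    "order_confirmation": SMALL,
    "chitchat": SMALL,
    "order_planning": LARGE,
}


def pick_config(goal: str) -> LLMConfig:
    """Return the configured tier for a goal; unknown goals fall back to the large tier."""
    return ROUTES.get(goal, LARGE)
```

Under these assumptions, simple goals such as `"chitchat"` use the cheap tier, while goals that need spatiotemporal reasoning, and any goal the table does not list, go to the larger model, trading some cost for a safer default.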
Yansong NING
The Hong Kong University of Science and Technology (Guangzhou)
LLM reasoning, agent, knowledge graph, urban computing
Shuowei Cai
HKUST(GZ)
Federated Learning
Wei Li
Didichuxing Co. Ltd
Jun Fang
Didichuxing Co. Ltd
Naiqiang Tan
Didichuxing Co. Ltd
Hua Chai
Didichuxing Co. Ltd
Hao Liu
The Hong Kong University of Science and Technology (Guangzhou)