DiMA: An LLM-Powered Ride-Hailing Assistant at DiDi

📅 2025-02-12

🏛️ arXiv.org

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Problem: Natural interaction in dynamic, complex urban spatiotemporal environments poses significant challenges for on-demand ride-hailing services. Method: This paper proposes DiMA, a conversational assistant framework tailored to ride-hailing scenarios and deployed at DiDi. It introduces a spatiotemporally aware order-planning module that invokes external tools for spatiotemporal reasoning, a cost-effective dialogue system that pairs multiple types of dialogue repliers with cost-aware LLM configurations, and a continual fine-tuning scheme that uses real-world interactions and simulated dialogues to align the assistant with human-preferred decision-making. Contribution/Results: In online deployment, DiMA reaches 93% accuracy in order planning and 92% in response generation. In offline experiments against three state-of-the-art agent frameworks, order planning and response generation improve by up to 70.23% and 321.27%, respectively, while latency is reduced by 0.72× to 5.47×. These results position DiMA as an effective and efficient mobile assistant for ride-hailing services.

📝 Abstract

On-demand ride-hailing services like DiDi, Uber, and Lyft have transformed urban transportation, offering unmatched convenience and flexibility. In this paper, we introduce DiMA, an LLM-powered ride-hailing assistant deployed in DiDi Chuxing. Its goal is to provide seamless ride-hailing services and beyond through a natural and efficient conversational interface under dynamic and complex spatiotemporal urban contexts. To achieve this, we propose a spatiotemporal-aware order planning module that leverages external tools for precise spatiotemporal reasoning and progressive order planning. Additionally, we develop a cost-effective dialogue system that integrates multi-type dialog repliers with cost-aware LLM configurations to handle diverse conversation goals and trade off response quality against latency. Furthermore, we introduce a continual fine-tuning scheme that utilizes real-world interactions and simulated dialogues to align the assistant's behavior with human-preferred decision-making processes. Since its deployment in the DiDi application, DiMA has demonstrated exceptional performance, achieving 93% accuracy in order planning and 92% in response generation during real-world interactions. Offline experiments further validate DiMA's capabilities, showing improvements of up to 70.23% in order planning and 321.27% in response generation compared to three state-of-the-art agent frameworks, while reducing latency by $0.72\times$ to $5.47\times$. These results establish DiMA as an effective, efficient, and intelligent mobile assistant for ride-hailing services.
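
The abstract does not say how the multi-type dialog repliers are paired with cost-aware LLM configurations. As a rough illustration only, the general idea of routing a conversation goal to a replier and a model tier under a latency budget might be sketched as follows. Every name here (`LLMConfig`, `ROUTES`, `route`, the goal labels, and the cost and latency figures) is a hypothetical placeholder, not something taken from the paper.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class LLMConfig:
    """Hypothetical model tier; all values are illustrative, not from the paper."""
    name: str
    cost_per_call: float  # relative cost, arbitrary units
    latency_ms: int       # assumed typical latency


# Two assumed tiers: a cheap, fast model and a costlier, slower one.
SMALL = LLMConfig("small", cost_per_call=1.0, latency_ms=200)
LARGE = LLMConfig("large", cost_per_call=8.0, latency_ms=1500)

# Assumed mapping from conversation goal to (replier, preferred tier).
ROUTES = {
    "order": ("order_replier", LARGE),
    "faq": ("faq_replier", SMALL),
    "chitchat": ("chat_replier", SMALL),
}


def route(goal: str, latency_budget_ms: int) -> tuple[str, LLMConfig]:
    """Pick a replier for the goal, falling back to the small tier
    when the preferred tier would exceed the latency budget."""
    replier, config = ROUTES.get(goal, ("chat_replier", SMALL))
    if config.latency_ms > latency_budget_ms:
        config = SMALL
    return replier, config


if __name__ == "__main__":
    print(route("order", latency_budget_ms=2000))
    print(route("order", latency_budget_ms=500))
```

In this sketch the goal selects the replier and the latency budget only changes the model tier, which is one simple way to trade response quality against latency; the paper's actual mechanism may differ.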

Problem

Research questions and friction points this paper is trying to address.

Develops a spatiotemporal-aware order planning module for ride-hailing

Creates a cost-effective dialogue system for diverse conversation goals

Introduces continual fine-tuning to align with human decision-making

Innovation

Methods, ideas, or system contributions that make the work stand out.

Spatiotemporal-aware order planning module

Cost-effective multi-type dialogue system

Continual fine-tuning on real-world interactions and simulated dialogues

🔎 Similar Papers

TraveLLM: Could you plan my new public transit route in face of a network disruption?