Urban-R1: Reinforced MLLMs Mitigate Geospatial Biases for Urban General Intelligence

📅 2025-10-18
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing urban foundation models suffer from significant geospatial bias, which leads to imbalanced regional predictions and poor generalization and limits the cross-domain understanding and reasoning that Urban General Intelligence (UGI) requires in complex urban environments. To address this, we propose Urban-R1, the first reinforcement learning (RL) post-training framework for urban multimodal large language models (MLLMs). Urban-R1 combines Group Relative Policy Optimization (GRPO) with an urban region profiling proxy task, explicitly modeling geographic heterogeneity to mitigate spatial bias. Drawing on diverse multi-source urban data, Urban-R1 improves fairness and out-of-distribution generalization on cross-regional urban understanding, planning, and reasoning tasks, and it surpasses both supervised fine-tuning (SFT) baselines and leading proprietary models on multiple benchmarks. The results point to RL-based alignment as a promising route toward robust, generalizable UGI systems that deliver equitable, context-aware urban intelligence.

📝 Abstract
Rapid urbanization intensifies the demand for Urban General Intelligence (UGI), referring to AI systems that can understand and reason about complex urban environments. Recent studies have built urban foundation models using supervised fine-tuning (SFT) of LLMs and MLLMs, yet these models exhibit persistent geospatial bias, producing regionally skewed predictions and limited generalization. To this end, we propose Urban-R1, a reinforcement learning-based post-training framework that aligns MLLMs with the objectives of UGI. Urban-R1 adopts Group Relative Policy Optimization (GRPO) to optimize reasoning across geographic groups and employs urban region profiling as a proxy task to provide measurable rewards from multimodal urban data. Extensive experiments across diverse regions and tasks show that Urban-R1 effectively mitigates geo-bias and improves cross-region generalization, outperforming both SFT-trained and closed-source models. Our results highlight reinforcement learning alignment as a promising pathway toward equitable and trustworthy urban intelligence.

Problem

Research questions and friction points this paper is trying to address.

Mitigating geospatial biases in urban AI systems
Improving cross-region generalization of urban models
Aligning multimodal models with urban intelligence objectives

Innovation

Methods, ideas, or system contributions that make the work stand out.

Reinforcement learning framework reduces geospatial bias
Group Relative Policy Optimization improves cross-region reasoning
Urban region profiling provides measurable multimodal rewards
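
To make the GRPO point above concrete, here is a minimal sketch of the group-relative advantage step as it appears in the standard GRPO formulation: several responses are sampled for the same prompt, each is scored, and each reward is normalized against the mean and standard deviation of its own group. This is not the Urban-R1 implementation. The function name, the `eps` constant, the use of the population standard deviation, and the example reward values are all assumptions made for illustration; the summary does not say how Urban-R1 forms its groups or computes its region-profiling rewards.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize rewards within one group of sampled responses.

    `rewards` holds one scalar score per response sampled for the same
    prompt. Each advantage is (reward - group mean) / (group std + eps).
    Hypothetical helper: the name and `eps` are illustrative assumptions.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption, variants exist
    return [(r - mu) / (sigma + eps) for r in rewards]


# Illustrative rewards for four sampled answers to one region-profiling
# question (made-up numbers, not results from the paper).
rewards = [0.2, 0.9, 0.5, 0.4]
advantages = group_relative_advantages(rewards)

for r, a in zip(rewards, advantages):
    print(f"reward={r:.2f}  advantage={a:+.3f}")
```

Because the baseline is the group's own mean, responses that score above their siblings receive positive advantages and the rest negative ones, so no separate value network is needed. When every response in a group gets the same reward, all advantages are zero and that group contributes no learning signal.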