TartanGround: A Large-Scale Dataset for Ground Robot Perception and Navigation

📅 2025-05-15
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing ground robot datasets generalize poorly, limiting perception and navigation in complex environments. To address this, the authors introduce TartanGround, a large-scale multimodal simulation dataset spanning 70 diverse environments, 910 trajectories, and 1.5 million samples, with sensor modalities including RGB stereo, depth, optical flow, stereo disparity, LiDAR point clouds, semantic segmentation, and semantically labeled occupancy maps. An integrated automated data-acquisition pipeline, built on photorealistic simulation environments, provides synchronized multi-sensor capture and generates trajectories that mimic the motion patterns of both wheeled and legged robot platforms. Evaluations on occupancy prediction and SLAM show that state-of-the-art methods trained on existing datasets struggle to generalize across diverse scenes, positioning TartanGround as a reproducible, scalable testbed for occupancy prediction, SLAM, neural scene representation, and learning-based navigation.

📝 Abstract
We present TartanGround, a large-scale, multi-modal dataset to advance the perception and autonomy of ground robots operating in diverse environments. This dataset, collected in various photorealistic simulation environments, includes multiple RGB stereo cameras for 360-degree coverage, along with depth, optical flow, stereo disparity, LiDAR point clouds, ground truth poses, semantically segmented images, and occupancy maps with semantic labels. Data is collected using an integrated automatic pipeline, which generates trajectories mimicking the motion patterns of various ground robot platforms, including wheeled and legged robots. We collect 910 trajectories across 70 environments, resulting in 1.5 million samples. Evaluations on occupancy prediction and SLAM tasks reveal that state-of-the-art methods trained on existing datasets struggle to generalize across diverse scenes. TartanGround can serve as a testbed for training and evaluating a broad range of learning-based tasks, including occupancy prediction, SLAM, neural scene representation, and perception-based navigation, enabling progress toward robust models that generalize to more diverse scenarios. The dataset and codebase for data collection will be made publicly available upon acceptance. Webpage: https://tartanair.org/tartanground
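The abstract lists one synchronized set of modalities per frame. As a minimal sketch of how such a record might be organized, the snippet below defines a hypothetical per-frame schema; the field names, file extensions, and pose convention (x, y, z, qx, qy, qz, qw) are illustrative assumptions, not the dataset's actual on-disk layout.

```python
from dataclasses import dataclass, fields
from typing import List, Tuple

# Hypothetical schema mirroring the modalities named in the abstract.
# Not the real TartanGround format -- an assumption for illustration only.
@dataclass
class GroundFrame:
    rgb_stereo: List[str]    # image paths from the multi-camera 360-degree stereo rig
    depth: str               # depth image path
    optical_flow: str        # optical-flow field path
    disparity: str           # stereo-disparity path
    lidar: str               # LiDAR point-cloud path
    semantics: str           # semantic-segmentation image path
    pose: Tuple[float, ...]  # ground-truth pose (assumed x, y, z, qx, qy, qz, qw)

def modality_names(frame: GroundFrame) -> List[str]:
    """List the sensor modalities stored in a frame (pose excluded)."""
    return [f.name for f in fields(frame) if f.name != "pose"]

frame = GroundFrame(
    rgb_stereo=["cam0_left.png", "cam0_right.png"],
    depth="depth.png",
    optical_flow="flow.npy",
    disparity="disp.png",
    lidar="scan.ply",
    semantics="seg.png",
    pose=(0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0),
)
print(modality_names(frame))
# -> ['rgb_stereo', 'depth', 'optical_flow', 'disparity', 'lidar', 'semantics']
```

A real loader would resolve these paths within a trajectory directory; the grouping itself is the point: each sample bundles every modality captured at the same timestamp.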
Problem

Research questions and friction points this paper is trying to address.

Advancing ground robot perception in diverse environments
Addressing generalization issues in occupancy prediction and SLAM
Providing a multi-modal dataset for learning-based robotic tasks
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-modal dataset with diverse sensor data
Integrated automatic pipeline for trajectory generation
Supports training for various learning-based robotics tasks