LLM-MCoX: Large Language Model-based Multi-robot Coordinated Exploration and Search

📅 2025-09-30
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address weak coordination and limited semantic understanding in multi-robot collaborative exploration and target search within unknown indoor environments, this paper proposes an integrated framework combining LiDAR perception with multimodal large language models (MLLMs). The method couples real-time front-end mapping, frontier clustering, and doorway detection with joint reasoning by GPT-4o to build a shared semantic map and a state-driven task allocation mechanism, enabling natural-language instruction parsing and cross-modal environmental reasoning. By applying MLLMs to high-level path planning and semantic search decisions, the framework overcomes the limitations of conventional greedy and Voronoi-based strategies in dynamic collaboration and semantic abstraction. Experiments with six robots show 22.7% faster exploration and a 50% gain in target search efficiency, improving scalability, coordination robustness, and practical utility for large heterogeneous robot teams.

📝 Abstract
Autonomous exploration and object search in unknown indoor environments remain challenging for multi-robot systems (MRS). Traditional approaches often rely on greedy frontier assignment strategies with limited inter-robot coordination. In this work, we introduce LLM-MCoX (LLM-based Multi-robot Coordinated Exploration and Search), a novel framework that leverages Large Language Models (LLMs) for intelligent coordination of both homogeneous and heterogeneous robot teams tasked with efficient exploration and target object search. Our approach combines real-time LiDAR scan processing for frontier cluster extraction and doorway detection with multimodal LLM reasoning (e.g., GPT-4o) to generate coordinated waypoint assignments based on shared environment maps and robot states. LLM-MCoX demonstrates superior performance compared to existing methods, including greedy and Voronoi-based planners, achieving 22.7% faster exploration times and 50% improved search efficiency in large environments with 6 robots. Notably, LLM-MCoX enables natural language-based object search capabilities, allowing human operators to provide high-level semantic guidance that traditional algorithms cannot interpret.
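The frontier-extraction step the abstract describes — finding free cells that border unknown space and grouping them into clusters — can be sketched as follows. This is an illustrative reconstruction on a toy occupancy grid, not the paper's actual implementation; the grid encoding (0 = free, 1 = occupied, -1 = unknown) is an assumption.

```python
# Hypothetical sketch of frontier-cluster extraction (not the paper's code).
# Grid cells: 0 = free, 1 = occupied, -1 = unknown.
from collections import deque

def find_frontiers(grid):
    """Return frontier cells: free cells 4-adjacent to unknown space."""
    rows, cols = len(grid), len(grid[0])
    frontiers = set()
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != 0:
                continue
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == -1:
                    frontiers.add((r, c))
                    break
    return frontiers

def cluster_frontiers(frontiers):
    """Group frontier cells into connected clusters via BFS (4-connectivity)."""
    clusters, seen = [], set()
    for cell in frontiers:
        if cell in seen:
            continue
        cluster, queue = [], deque([cell])
        seen.add(cell)
        while queue:
            r, c = queue.popleft()
            cluster.append((r, c))
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nb = (r + dr, c + dc)
                if nb in frontiers and nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
        clusters.append(cluster)
    return clusters
```

Each cluster's centroid would then serve as a candidate waypoint for the LLM coordinator to assign to a robot.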
Problem

Research questions and friction points this paper is trying to address.

Enables multi-robot coordinated exploration in unknown indoor environments
Improves object search efficiency using Large Language Model reasoning
Allows natural language guidance for semantic search and coordination
Innovation

Methods, ideas, or system contributions that make the work stand out.

Leverages Large Language Models for multi-robot coordination
Combines LiDAR processing with multimodal LLM reasoning
Enables natural language-based object search capabilities
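One way the coordination step might look in practice is to serialize the shared map summary and robot states into a prompt, ask the multimodal LLM for a robot-to-frontier assignment, and validate its reply. The message format, field names, and round-robin fallback below are illustrative assumptions, not the paper's actual interface.

```python
# Hypothetical sketch of LLM-based waypoint assignment (illustrative only).
import json

def build_coordination_prompt(robot_states, frontier_centroids, target=None):
    """Pack robot states and frontier centroids into a text prompt."""
    task = f"search for '{target}'" if target else "explore the environment"
    lines = [
        f"You coordinate {len(robot_states)} robots to {task}.",
        "Robot states (id: position):",
    ]
    for rid, pos in sorted(robot_states.items()):
        lines.append(f"  {rid}: {pos}")
    lines.append("Frontier cluster centroids:")
    for i, centroid in enumerate(frontier_centroids):
        lines.append(f"  F{i}: {centroid}")
    lines.append('Reply with JSON mapping each robot id to a frontier id, '
                 'e.g. {"robot_0": "F1"}.')
    return "\n".join(lines)

def parse_assignment(llm_reply, robot_states):
    """Validate the LLM's JSON reply; fall back to round-robin on failure."""
    try:
        assignment = json.loads(llm_reply)
        if set(assignment) == set(robot_states):
            return assignment
    except json.JSONDecodeError:
        pass
    return {rid: f"F{i}" for i, rid in enumerate(sorted(robot_states))}
```

Validating and falling back deterministically matters here: a malformed or partial LLM reply must not leave any robot without a waypoint.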
Ruiyang Wang
Hao-Lun Hsu
David Hunt
Shaocheng Luo
Jiwoo Kim (Department of Artificial Intelligence, Sungkyunkwan University)
Miroslav Pajic