🤖 AI Summary
This work addresses the challenge of combining semantic understanding with efficient exploration for micro aerial vehicles (MAVs) in complex, unstructured 3D environments, a trade-off that directly affects search-and-rescue performance. The authors propose a semantically guided viewpoint planning framework that tightly integrates semantic reasoning with 3D exploration. LLM-based similarity scores between observed objects and the target serve as semantic priorities; an active perception pipeline propagates these priorities into neighboring frontier voxels and computes the semantic information gain of each frontier viewpoint. A combinatorial planner then uses these gains to produce efficient exploration plans that prioritize viewpoints likely to lead to the target. In two distinct simulation environments, the method consistently outperforms baselines, finding the target quickly while keeping exploration time reasonable. Real-world MAV trials further demonstrate that the framework copes with limited battery life, small sensor range, and semantic uncertainty.
📝 Abstract
Autonomous target search is crucial for deploying Micro Aerial Vehicles (MAVs) in emergency response and rescue missions. Existing approaches focus either on 2D semantic navigation in structured environments, which is less effective in complex 3D settings, or on robotic exploration in cluttered spaces, which often lacks the semantic reasoning needed for efficient target search. This paper overcomes these limitations by proposing a novel framework that uses a semantically guided viewpoint planner to minimize target search and exploration time in unstructured 3D environments with an MAV. Specifically, we develop a combinatorial planner that generates efficient semantic exploration plans by prioritizing viewpoints that are likely to lead to the target. To guide the planner towards the target, we develop an active perception pipeline that propagates the semantic priorities of observed objects into neighboring frontier voxels and uses them to compute the semantic information gain of each frontier viewpoint. In addition, we demonstrate how LLM-based similarity scores can be used as the semantic priority input to our pipeline. Evaluations in two distinct simulation environments show that the proposed method consistently outperforms baselines, finding the target quickly while maintaining reasonable exploration times. Real-world experiments with an MAV further demonstrate the method's ability to handle practical constraints such as limited battery life, small sensor range, and semantic uncertainty.
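
The priority propagation and gain computation described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the linear distance decay, the `radius` and `sensor_range` parameters, and all function names are assumptions made for illustration, and the object scores stand in for the LLM-based similarity scores. The final selection step is a simple greedy choice, used here in place of the paper's combinatorial planner.

```python
import math

def propagate_priorities(objects, frontiers, radius=2.0):
    """Give each frontier voxel the highest distance-decayed priority
    among observed objects within `radius` (assumed linear decay)."""
    priorities = []
    for f in frontiers:
        best = 0.0
        for pos, score in objects:
            d = math.dist(f, pos)
            if d <= radius:
                best = max(best, score * (1.0 - d / radius))
        priorities.append(best)
    return priorities

def semantic_gain(viewpoint, frontiers, priorities, sensor_range=3.0):
    """Sum the priorities of frontier voxels within sensor range of a viewpoint.
    Occlusion and field of view are ignored in this sketch."""
    return sum(
        p for f, p in zip(frontiers, priorities)
        if math.dist(viewpoint, f) <= sensor_range
    )

def best_viewpoint(viewpoints, frontiers, priorities, sensor_range=3.0):
    """Greedy stand-in for the planner: pick the viewpoint with the largest gain."""
    return max(
        viewpoints,
        key=lambda v: semantic_gain(v, frontiers, priorities, sensor_range),
    )

# Hypothetical scene: one object strongly related to the target, one weakly related.
objects = [((0.0, 0.0, 0.0), 0.9), ((10.0, 0.0, 0.0), 0.1)]
frontiers = [(1.0, 0.0, 0.0), (9.0, 0.0, 0.0), (5.0, 0.0, 0.0)]
viewpoints = [(0.0, 0.0, 1.0), (10.0, 0.0, 1.0)]

priorities = propagate_priorities(objects, frontiers)
print(best_viewpoint(viewpoints, frontiers, priorities))
```

In this toy scene the viewpoint next to the high-similarity object is selected, because the frontier voxel near that object inherits most of its priority. A real system would also need to account for travel cost, which the abstract's emphasis on minimizing search and exploration time implies but this sketch omits.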