🤖 AI Summary
Current autonomous drones suffer from limited command understanding, real-time navigation capability, and environmental adaptability, constrained in particular by high latency and excessive power consumption. This paper proposes a real-time neuromorphic navigation framework tailored for edge deployment, achieving, for the first time, deep integration of large language models (LLMs) and spiking neural networks (SNNs) to form a closed-loop system spanning speech/text command interpretation, event-camera-based perception, and physics-driven motion planning. Deployed on the Parrot Bebop2 platform, the framework enables dynamic obstacle avoidance, semantic navigation, and adaptive decision-making, with end-to-end latency under 80 ms and a 62% reduction in power consumption. Key contributions include: (1) an LLM–SNN co-architectural design that jointly leverages high-level semantic comprehension and the ultra-low-latency, energy-efficient properties of spike-based computation; and (2) tight coupling of event-camera sensing with physics-informed motion planning, significantly enhancing real-time responsiveness and robustness in complex, dynamic environments.
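The closed loop described above (command interpretation → event-based perception → motion planning) can be sketched roughly as follows. This is an illustrative stub, not the authors' implementation: all names (`interpret_utterance`, `snn_obstacle_estimate`, `plan_velocity`) are hypothetical, the LLM stage is replaced by a keyword rule so the example stays runnable, and the SNN stage is reduced to a simple event-density proxy.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class Command:
    target: str       # free-form goal description from the user
    max_speed: float  # m/s cap derived from the instruction

def interpret_utterance(utterance: str) -> Command:
    # Stand-in for the LLM stage: map speech/text to a structured plan.
    # A real system would query an LLM; a keyword rule keeps this self-contained.
    slow = "slowly" in utterance.lower()
    return Command(target=utterance, max_speed=0.5 if slow else 1.5)

def snn_obstacle_estimate(events: List[Tuple[int, int]]) -> float:
    # Stand-in for the event-camera + SNN perception stage: the fraction of
    # events landing in the image centre serves as a crude proximity proxy.
    centre = [e for e in events if 100 <= e[0] <= 220 and 60 <= e[1] <= 180]
    return len(centre) / max(len(events), 1)

def plan_velocity(cmd: Command, proximity: float) -> float:
    # Physics-style planner stub: scale commanded speed down as proximity rises.
    return cmd.max_speed * max(0.0, 1.0 - proximity)

cmd = interpret_utterance("fly to the door slowly")
velocity = plan_velocity(cmd, snn_obstacle_estimate([(150, 100), (10, 10)]))
```

The point of the sketch is the data flow, not the components: each stage can be swapped for the real LLM, SNN, and planner without changing the loop structure.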
📝 Abstract
The integration of human-intuitive interactions into autonomous systems has been limited. Traditional Natural Language Processing (NLP) systems struggle with context and intent understanding, severely restricting human-robot interaction. Recent advancements in Large Language Models (LLMs) have transformed this dynamic, allowing for intuitive and high-level communication through speech and text, and bridging the gap between human commands and robotic actions. Additionally, autonomous navigation has emerged as a central focus in robotics research, with artificial intelligence (AI) increasingly being leveraged to enhance these systems. However, existing AI-based navigation algorithms face significant challenges in latency-critical tasks where rapid decision-making is essential. Traditional frame-based vision systems, while effective for high-level decision-making, suffer from high energy consumption and latency, limiting their applicability in real-time scenarios. Neuromorphic vision systems, combining event-based cameras and spiking neural networks (SNNs), offer a promising alternative by enabling energy-efficient, low-latency navigation. Despite their potential, real-world implementations of these systems, particularly on physical platforms such as drones, remain scarce. In this work, we present Neuro-LIFT, a real-time neuromorphic navigation framework implemented on a Parrot Bebop2 quadrotor. Leveraging an LLM for natural language processing, Neuro-LIFT translates human speech into high-level planning commands, which are then autonomously executed using event-based neuromorphic vision and physics-driven planning. Our framework demonstrates its capabilities in navigating a dynamic environment, avoiding obstacles, and adapting to human instructions in real time.
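The energy-efficiency argument for SNNs rests on event-driven spiking dynamics: a neuron only does meaningful work when input events arrive. A minimal sketch of the standard leaky integrate-and-fire (LIF) update, which is the basic building block of most SNNs (parameter values here are illustrative, not taken from the paper):

```python
def lif_step(v, i_in, leak=0.9, v_th=1.0):
    """One leaky integrate-and-fire (LIF) update: leak, integrate, fire, reset."""
    v = leak * v + i_in       # membrane potential decays, then integrates input
    if v >= v_th:
        return 0.0, True      # threshold crossed: emit a spike, reset membrane
    return v, False

# Event-driven input: the neuron accumulates charge only when events arrive,
# which is why spiking pipelines stay cheap on mostly-static scenes.
inputs = [0.0, 0.5, 0.0, 0.5, 0.5]
v, spikes = 0.0, 0
for i_in in inputs:
    v, fired = lif_step(v, i_in)
    spikes += fired           # spikes == 1: only the last input crosses threshold
```

Between spikes the neuron produces no output at all, so downstream layers receive sparse binary activity rather than dense frames, which pairs naturally with the sparse output of an event camera.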