On-Device Robotic Planning: Eliminating Inference Redundancy for Efficient Decision-Making

📅 2026-05-29

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work proposes REIS, a decision-making framework that targets the high inference latency of large language and vision-language models in robotic control, which hinders real-time deployment despite their strong semantic planning capabilities. REIS identifies and exploits the temporal redundancy inherent in robotic reasoning, where consecutive observations frequently yield identical actions and subgoals. Using a lightweight scene gating mechanism and a KV-steered affordance routing strategy, it skips redundant computation and falls back to deliberative reasoning only when needed, preserving semantic adaptability. Combined with edge-deployment optimizations, the framework substantially reduces inference overhead on the ALFRED benchmark and on real-world robotic tasks while maintaining competitive task success rates.

📝 Abstract

Reasoning-based robotic policies using large language and vision-language models achieve strong semantic planning capabilities but often suffer from high inference latency that limits practical real-time deployment. In this work, we observe that robotic reasoning workloads contain substantial temporal redundancy, where consecutive observations frequently produce identical actions and subgoals. Based on this insight, we present REIS, a human-cognition-inspired robotic decision-making framework that minimizes unnecessary reasoning while preserving semantic adaptability. REIS combines lightweight scene gating, KV-steered affordance routing, and deliberative reasoning to accelerate robotic control under embodied constraints. Experiments on ALFRED and real-world robotic tasks demonstrate that REIS significantly reduces reasoning overhead while maintaining competitive task performance.
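
To make the temporal-redundancy idea concrete, here is a minimal Python sketch of a scene gate that reuses the previous action while the observation stays close to the one that last triggered planning. It is an illustration under stated assumptions, not the REIS implementation: the `SceneGate` class, the cosine-similarity test, the 0.95 threshold, and the `planner` callable (a stand-in for the expensive language or vision-language model) are all hypothetical, and the KV-steered affordance routing stage is not modeled.

```python
import math
from typing import Callable, List, Optional, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


class SceneGate:
    """Hypothetical gate: skip the planner while the scene looks unchanged."""

    def __init__(self, planner: Callable[[Sequence[float]], str], threshold: float = 0.95):
        self.planner = planner            # assumed stand-in for the slow reasoning model
        self.threshold = threshold        # assumed similarity cutoff, not from the paper
        self.planner_calls = 0            # counts how often the slow path ran
        self._ref_features: Optional[List[float]] = None
        self._last_action: Optional[str] = None

    def act(self, features: Sequence[float]) -> str:
        # Fast path: the observation is still similar to the one that was last planned on.
        if (
            self._ref_features is not None
            and self._last_action is not None
            and cosine_similarity(features, self._ref_features) >= self.threshold
        ):
            return self._last_action
        # Slow path: run the planner and store this observation as the new reference.
        self.planner_calls += 1
        self._last_action = self.planner(features)
        self._ref_features = list(features)
        return self._last_action


def toy_planner(features: Sequence[float]) -> str:
    """Toy planner used only for the demonstration below."""
    return "move_forward" if features[0] > features[1] else "turn_left"


gate = SceneGate(toy_planner, threshold=0.95)
actions = [gate.act(f) for f in ([1.0, 0.0], [0.99, 0.05], [0.0, 1.0])]
```

In this toy run the second observation is nearly identical to the first, so the gate returns the cached action and the planner runs only for the first and third observations. The sketch compares each observation against the one that last triggered planning rather than against the immediately preceding frame, so a slow drift in the scene eventually crosses the threshold and forces a fresh plan.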

Problem

Research questions and friction points this paper is trying to address.

robotic planning

inference latency

temporal redundancy

real-time deployment

reasoning overhead

Innovation

Methods, ideas, or system contributions that make the work stand out.

temporal redundancy

on-device planning

KV-steered affordance routing