🤖 AI Summary
This work addresses the challenge that existing memory-augmented methods struggle to simultaneously achieve precise task-state tracking and robust long-term memory retention. Inspired by the Atkinson-Shiffrin memory model, we propose MemoAct, a hierarchical memory architecture for robotic policies that integrates human-like multi-level memory mechanisms. The architecture employs a lossless short-term memory buffer to ensure operational precision while leveraging a compressed long-term memory representation to maintain temporal consistency. Evaluated on the newly introduced MemoryRTBench, the approach outperforms Markovian baselines and existing history-aware policies in both simulated and real-world environments, on task-state tracking as well as long-horizon, memory-intensive tasks.
📝 Abstract
Memory-augmented robotic policies are essential for handling memory-dependent tasks. However, existing approaches typically rely on simple extensions of the observation window and struggle to simultaneously achieve precise task-state tracking and robust long-horizon retention. To overcome these challenges, and inspired by the Atkinson-Shiffrin memory model, we propose MemoAct, a hierarchical memory-based policy that uses distinct memory tiers to tackle specific bottlenecks. Specifically, lossless short-term memory ensures precise task-state tracking, while compressed long-term memory enables robust long-horizon retention. To enrich the evaluation landscape, we construct MemoryRTBench, built on RoboTwin 2.0 and specifically tailored to assess policy capabilities in task-state tracking and long-horizon retention. Extensive experiments across simulated and real-world scenarios demonstrate that MemoAct achieves superior performance compared to both existing Markovian baselines and history-aware policies. The project page is available at https://tlf-tlf.github.io/MemoActPage/.
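To make the two-tier idea concrete, the following is a minimal sketch of a memory that keeps recent observations verbatim and folds older ones into compact summaries. It is not MemoAct's implementation: the abstract does not specify how long-term memory is compressed, so the `mean_pool` compressor, the `HierarchicalMemory` class, and the `short_capacity` and `chunk_size` parameters are all hypothetical names chosen for illustration.

```python
from collections import deque
from typing import Deque, List

Vector = List[float]


def mean_pool(chunk: List[Vector]) -> Vector:
    """Placeholder compressor (an assumption): element-wise mean of a chunk."""
    n = len(chunk)
    return [sum(col) / n for col in zip(*chunk)]


class HierarchicalMemory:
    """Illustrative two-tier memory; not the MemoAct implementation."""

    def __init__(self, short_capacity: int, chunk_size: int) -> None:
        self.short_capacity = short_capacity  # raw observations kept verbatim
        self.chunk_size = chunk_size          # evicted observations per summary
        self.short_term: Deque[Vector] = deque()
        self.pending: List[Vector] = []       # staging area awaiting compression
        self.long_term: List[Vector] = []     # compressed summaries, oldest first

    def add(self, obs: Vector) -> None:
        # Short-term tier: store the observation unchanged (lossless).
        self.short_term.append(list(obs))
        # When the buffer overflows, move the oldest observation to staging.
        if len(self.short_term) > self.short_capacity:
            self.pending.append(self.short_term.popleft())
        # Long-term tier: compress a full chunk into a single summary vector.
        if len(self.pending) == self.chunk_size:
            self.long_term.append(mean_pool(self.pending))
            self.pending = []

    def context(self) -> List[Vector]:
        # Summaries first, then exact recent observations. Staged observations
        # are not exposed until they have been compressed.
        return self.long_term + list(self.short_term)


if __name__ == "__main__":
    mem = HierarchicalMemory(short_capacity=2, chunk_size=2)
    for value in (0.0, 2.0, 4.0, 6.0):
        mem.add([value])
    print(mem.context())  # [[1.0], [4.0], [6.0]]
```

In this toy setup, the two oldest observations are averaged into one summary while the two most recent remain exact, so the context grows by one entry per `chunk_size` evicted steps rather than one entry per step. A learned compressor would presumably replace `mean_pool` in any real policy.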