🤖 AI Summary
Embodied foundation models suffer from catastrophic forgetting in multi-task continual learning. To address this, we propose a gradient-free, statistics-driven analytic scheduling framework that performs language-instruction-driven dynamic model selection using a library of task-specific models and an online scheduler based on recursive least squares (RLS). The scheduler operates solely on incrementally updated autocorrelation and cross-correlation matrices, so it requires no replay of historical data, and keeping a separate model per task architecturally isolates parameters from cross-task interference. Evaluated on the RM65B real-world robotic platform, our method substantially mitigates forgetting and supports millisecond-scale task switching, zero-shot instruction generalization, and plug-and-play task expansion. It achieves high scalability and real-time deployability without compromising performance.
📝 Abstract
Embodied foundation models are crucial for Artificial Intelligence (AI) interacting with the physical world by integrating multi-modal inputs, such as proprioception, vision and language, to understand human intentions and generate actions to control robots. While these models demonstrate strong generalization and few-shot learning capabilities, they face significant challenges in continually acquiring new skills without forgetting previously learned skills, a problem known as catastrophic forgetting. To address this issue, we propose the Analytic Task Scheduler (ATS), a novel framework for continual learning in embodied foundation models. ATS consists of a task-specific model library, where each model is fine-tuned independently on a single task, and an analytic scheduler trained using recursive least squares (RLS) to learn the mapping between language instructions and task-specific models. This architecture enables accurate task recognition and dynamic model selection while fundamentally avoiding parameter interference across tasks. The scheduler updates its parameters incrementally using only statistics (autocorrelation and cross-correlation matrices), enabling forgetting-resistant learning without the need to revisit historical data. We validate ATS on a real-world robot platform (RM65B), demonstrating superior resistance to forgetting and strong adaptability to task variations. The results highlight ATS as an effective, scalable, and deployable solution for continual learning in embodied foundation models operating in complex, dynamic environments. Our code will be available at https://github.com/MIAA-Embodied-AI/AnalyticTaskScheduler
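The scheduler idea described above can be illustrated with a minimal sketch. This is not the authors' implementation; it is a generic ridge-regression classifier updated by recursive least squares, under several assumptions: each instruction has already been turned into a fixed-length embedding by some frozen encoder (not shown), targets are one-hot task labels, and the names `AnalyticScheduler`, `add_task`, `update`, `select`, and the regularizer `gamma` are hypothetical. Only the inverse autocorrelation matrix `P` and the weight matrix `W` are stored, so earlier samples never need to be revisited.

```python
class AnalyticScheduler:
    """Hypothetical RLS-based scheduler: maps an instruction embedding to a task id."""

    def __init__(self, dim, gamma=1.0):
        self.dim = dim
        # P tracks the inverse of the regularized autocorrelation matrix
        # (X^T X + gamma * I); it starts at I / gamma before any data is seen.
        self.P = [[(1.0 / gamma if i == j else 0.0) for j in range(dim)]
                  for i in range(dim)]
        self.W = [[] for _ in range(dim)]  # dim x num_tasks weight matrix
        self.num_tasks = 0

    def add_task(self):
        """Register a new task by appending a zero weight column.

        Existing columns are left untouched, so earlier tasks are not disturbed.
        """
        for row in self.W:
            row.append(0.0)
        self.num_tasks += 1
        return self.num_tasks - 1

    def scores(self, x):
        """Linear scores W^T x, one per registered task."""
        return [sum(self.W[i][k] * x[i] for i in range(self.dim))
                for k in range(self.num_tasks)]

    def update(self, x, task_id):
        """Fold one (embedding, task) pair into P and W without replaying old data."""
        Px = [sum(p * xi for p, xi in zip(row, x)) for row in self.P]
        denom = 1.0 + sum(xi * v for xi, v in zip(x, Px))
        gain = [v / denom for v in Px]
        # Rank-1 (Sherman-Morrison) update of the inverse autocorrelation matrix.
        for i in range(self.dim):
            for j in range(self.dim):
                self.P[i][j] -= gain[i] * Px[j]
        # Correct W by the gain times the prediction error on the one-hot target.
        pred = self.scores(x)
        for k in range(self.num_tasks):
            err = (1.0 if k == task_id else 0.0) - pred[k]
            for i in range(self.dim):
                self.W[i][k] += gain[i] * err

    def select(self, x):
        """Return the index of the task-specific model with the highest score."""
        s = self.scores(x)
        return max(range(self.num_tasks), key=s.__getitem__)


# Toy usage with made-up 3-dimensional "embeddings".
sched = AnalyticScheduler(dim=3, gamma=0.1)
pick = sched.add_task()
place = sched.add_task()
sched.update([1.0, 0.0, 0.0], pick)
sched.update([0.0, 1.0, 0.0], place)
pour = sched.add_task()              # a task added later, with no replay of earlier samples
sched.update([0.0, 0.0, 1.0], pour)
print(sched.select([0.9, 0.1, 0.0]))
```

In this sketch, appending a zero column for a new task gives the same weights as refitting the ridge regression on all data seen so far, because the new task's targets on every earlier sample are zero. In a full system the returned index would be used to look up the corresponding fine-tuned model in the task-specific library.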