Overcoming Knowledge Barriers: Online Imitation Learning from Visual Observation with Pretrained World Models

📅 2024-04-29

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses two knowledge barriers that hinder pretrained models in vision-based Imitation Learning from Observation (ILfO): the Embodiment Knowledge Barrier (EKB) and the Demonstration Knowledge Barrier (DKB). The EKB arises because a pretrained model handles novel observations poorly, which makes action inference inaccurate. The DKB arises because a limited demonstration dataset restricts how well the model adapts across scenarios. The authors propose AIME-NoB, which extends Action Inference by Maximising Evidence (AIME) with a separate remedy for each barrier: online interactions and a data-driven regulariser mitigate the EKB, and a surrogate reward function broadens the states the policy supports to address the DKB. On vision-based control tasks from the DeepMind Control Suite and Meta-World benchmarks, AIME-NoB significantly improves sample efficiency and converged performance. Code is publicly available.

📝 Abstract

Pretraining and finetuning models has become increasingly popular in decision-making. But there are still serious impediments in Imitation Learning from Observation (ILfO) with pretrained models. This study identifies two primary obstacles: the Embodiment Knowledge Barrier (EKB) and the Demonstration Knowledge Barrier (DKB). The EKB emerges due to the pretrained models' limitations in handling novel observations, which leads to inaccurate action inference. Conversely, the DKB stems from the reliance on limited demonstration datasets, restricting the model's adaptability across diverse scenarios. We propose separate solutions to overcome each barrier and apply them to Action Inference by Maximising Evidence (AIME), a state-of-the-art algorithm. This new algorithm, AIME-NoB, integrates online interactions and a data-driven regulariser to mitigate the EKB. Additionally, it uses a surrogate reward function to broaden the policy's supported states, addressing the DKB. Our experiments on vision-based control tasks from the DeepMind Control Suite and MetaWorld benchmarks show that AIME-NoB significantly improves sample efficiency and converged performance, presenting a robust framework for overcoming the challenges in ILfO with pretrained models. Code available at https://github.com/IcarusWizard/AIME-NoB.

Problem

Research questions and friction points this paper is trying to address.

Overcoming the Embodiment Knowledge Barrier in ILfO with pretrained models

Addressing the Demonstration Knowledge Barrier caused by limited demonstration datasets

Improving sample efficiency in vision-based control tasks

Innovation

Methods, ideas, or system contributions that make the work stand out.

Online interactions mitigate the Embodiment Knowledge Barrier

Data-driven regulariser enhances action inference accuracy

Surrogate reward function broadens the policy's supported states, addressing the Demonstration Knowledge Barrier
