Difference-Aware Retrieval Policies for Imitation Learning

📅 2026-06-08

🤖 AI Summary

Behavioral cloning accumulates errors during deployment because the policy encounters out-of-distribution states and generalizes poorly to them. This work proposes a semi-parametric, retrieval-based imitation learning approach that reparameterizes the policy in terms of local neighborhood structure: it retrieves the k nearest neighbor states from expert demonstrations along with their corresponding actions, and predicts an action from these together with the relative distance vectors between the neighbor states and the query state. The method requires no additional data, online feedback, or task-specific priors. Evaluated on continuous control and robotic manipulation tasks, it improves on standard behavioral cloning by 15%–46% and is compatible with diverse state representations, including high-dimensional visual features.

📝 Abstract

Parametric imitation learning via behavior cloning can suffer from poor generalization to out-of-distribution states due to compounding errors during deployment. We show that reusing the training data during inference via a semi-parametric retrieval-based approach can alleviate this challenge. We present Difference-Aware Retrieval Policies for Imitation Learning (DARP), which reparameterizes the imitation learning problem in terms of local neighborhood structure rather than direct state-to-action mappings. Instead of learning a global policy, DARP trains a model to predict actions based on the $k$ nearest neighbors from expert demonstrations, their corresponding actions, and the relative distance vectors between neighbor states and query states. DARP requires no additional assumptions beyond those made for standard behavior cloning: it does not require additional data collection, online expert feedback, or task-specific knowledge. We demonstrate consistent performance improvements of 15–46% over standard behavior cloning across diverse domains, including continuous control and robotic manipulation, and across different representations, including high-dimensional visual features. Code and demos are available at https://weirdlabuw.github.io/darp-site/.
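The retrieval step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `retrieve_neighbors` and `build_model_input`, the use of Euclidean distance, the neighbor-minus-query sign convention, and the toy data are all assumptions made for illustration. The learned model that consumes these features is not shown.

```python
import numpy as np


def retrieve_neighbors(query, demo_states, demo_actions, k=3):
    """Return the k nearest expert states, their actions, and difference vectors.

    Hypothetical helper: Euclidean distance is an assumed metric.
    """
    # Distance from the query state to every demonstration state.
    dists = np.linalg.norm(demo_states - query, axis=1)
    # Indices of the k closest demonstration states, nearest first.
    idx = np.argsort(dists)[:k]
    neighbor_states = demo_states[idx]
    neighbor_actions = demo_actions[idx]
    # Relative difference vectors; neighbor minus query is an assumed convention.
    diffs = neighbor_states - query
    return neighbor_states, neighbor_actions, diffs


def build_model_input(neighbor_actions, diffs):
    """Pair each neighbor's action with its difference vector.

    Output shape: (k, action_dim + state_dim).
    """
    return np.concatenate([neighbor_actions, diffs], axis=1)


# Toy expert demonstrations: 2-D states, 1-D actions (made-up values).
demo_states = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [5.0, 5.0]])
demo_actions = np.array([[0.1], [0.2], [0.3], [0.9]])
query = np.array([0.1, 0.0])

_, actions, diffs = retrieve_neighbors(query, demo_states, demo_actions, k=2)
features = build_model_input(actions, diffs)
```

In this sketch, `features` holds one row per retrieved neighbor. A trained network would map these rows to a predicted action for the query state, in place of a direct state-to-action mapping.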

Problem

Research questions and friction points this paper is trying to address.

imitation learning

out-of-distribution generalization

compounding errors

behavior cloning

generalization

Innovation

Methods, ideas, or system contributions that make the work stand out.

retrieval-based imitation learning

semi-parametric policy

k-nearest neighbors