🤖 AI Summary
This work addresses imitation learning in extremely low-data regimes, with as few as 20 expert transitions. We propose Noise-Guided Transport (NGT), a lightweight off-policy method that formulates imitation as an optimal transport problem solved via adversarial training. NGT implicitly aligns the expert and agent policy distributions and requires no pretraining, large-scale models, or specialized architectures. Its core innovation is a controllable noise mechanism integrated into the optimal transport framework, which provides uncertainty estimation by design and supports robust policy learning. NGT trains efficiently with small neural networks, is straightforward to tune, and is quick to deploy. On challenging continuous-control benchmarks, including high-dimensional Humanoid tasks, NGT achieves strong performance under these ultra-low data regimes. The implementation is publicly available.
📝 Abstract
We consider imitation learning in the low-data regime, where only a limited number of expert demonstrations are available. In this setting, methods that rely on large-scale pretraining or high-capacity architectures can be difficult to apply, and efficiency with respect to demonstration data becomes critical. We introduce Noise-Guided Transport (NGT), a lightweight off-policy method that casts imitation as an optimal transport problem solved via adversarial training. NGT requires no pretraining or specialized architectures, incorporates uncertainty estimation by design, and is easy to implement and tune. Despite its simplicity, NGT achieves strong performance on challenging continuous control tasks, including high-dimensional Humanoid tasks, under ultra-low data regimes with as few as 20 transitions. Code is publicly available at: https://github.com/lionelblonde/ngt-pytorch.