A Unifying Framework for Causal Imitation Learning with Hidden Confounders

📅 2025-02-11
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper addresses causal imitation learning (Causal IL) under hidden confounding and unifies two types of confounders: (i) latent variables observed by the expert but not by the learner, and (ii) time-varying confounding noise observed by neither the expert nor the learner. It proposes a unified Causal IL framework that covers several existing confounded settings. The method uses trajectory history as an instrumental variable to recast policy learning as a set of conditional moment restrictions (CMRs), which are solved by instrumental variable regression combined with double machine learning. The authors derive an upper bound on the imitation error that accounts for confounding bias. Experiments on continuous control benchmarks, including MuJoCo, show that the approach outperforms existing Causal IL methods.

📝 Abstract
We propose a general and unifying framework for causal Imitation Learning (IL) with hidden confounders that subsumes several existing confounded IL settings from the literature. Our framework accounts for two types of hidden confounders: (a) those observed by the expert, which thus influence the expert's policy, and (b) confounding noise hidden to both the expert and the IL algorithm. For additional flexibility, we also introduce a confounding noise horizon and time-varying expert-observable hidden variables. We show that causal IL in our framework can be reduced to a set of Conditional Moment Restrictions (CMRs) by leveraging trajectory histories as instruments to learn a history-dependent policy. We propose DML-IL, a novel algorithm that uses instrumental variable regression to solve these CMRs and learn a policy. We provide a bound on the imitation gap for DML-IL, which recovers prior results as special cases. Empirical evaluation on a toy environment with continuous state-action spaces and multiple MuJoCo tasks demonstrates that DML-IL outperforms state-of-the-art causal IL algorithms.
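To illustrate why an instrument helps when hidden noise affects both the policy input and the expert action, here is a minimal sketch on a toy linear problem. It is not DML-IL: it swaps in plain two-stage least squares, and the data-generating process, the function names, and the scalar variable `z` (a stand-in for an older piece of trajectory history) are all assumptions made for illustration.

```python
import numpy as np


def simulate(n, theta, seed=0):
    """Toy confounded data (hypothetical setup, not from the paper)."""
    rng = np.random.default_rng(seed)
    z = rng.normal(size=n)  # instrument: stand-in for older trajectory history
    u = rng.normal(size=n)  # hidden confounding noise
    s = z + u + 0.1 * rng.normal(size=n)          # state depends on z and u
    a = theta * s + u + 0.1 * rng.normal(size=n)  # expert action also depends on u
    return z, s, a


def ols_slope(x, y):
    """Least-squares slope of y on x (no intercept; variables are zero-mean)."""
    return float(np.dot(x, y) / np.dot(x, x))


def two_stage_least_squares(z, s, a):
    """Stage 1: predict s from z. Stage 2: regress a on the prediction."""
    s_hat = ols_slope(z, s) * z
    return ols_slope(s_hat, a)


z, s, a = simulate(n=50_000, theta=2.0)
naive = ols_slope(s, a)                # behavioural-cloning-style regression
iv = two_stage_least_squares(z, s, a)  # instrument-based estimate
print(f"naive slope: {naive:.3f}, IV slope: {iv:.3f}, true slope: 2.000")
```

In this toy setup the naive regression is biased upward (roughly 2.5 instead of 2.0) because `u` drives both `s` and `a`, while the instrument-based estimate lands close to the true slope because `z` is independent of `u`.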
Problem

Research questions and friction points this paper is trying to address.

Causal Imitation Learning
Hidden Confounders
Conditional Moment Restrictions
Innovation

Methods, ideas, or system contributions that make the work stand out.

Causal Imitation Learning framework
Instrumental variable regression
Conditional Moment Restrictions