Estimating Dynamic Marginal Policy Effects under Sequential Unconfoundedness

📅 2026-04-07
🤖 AI Summary
This work addresses the problem of estimating the long-term effects of infinitesimal policy changes in dynamic systems where the full state is unobserved and dependencies may be non-Markovian. Under a sequential unconfoundedness assumption, the authors show that dynamic marginal policy effects (MPEs) can be identified through tractable reduced-form expressions that rely on neither full state observability nor the Markov property. They also propose a doubly robust estimator that avoids the exponential curse of horizon typical of non-Markovian off-policy evaluation. The practicality and robustness of the approach are demonstrated in simulation studies, including one motivated by a dynamic pricing application in which people use past prices to form a reference level for current prices.
📝 Abstract
We develop methods for estimating how infinitesimal policy changes affect long-term outcomes in dynamic systems. We show that dynamic marginal policy effects (MPEs) can be identified via tractable reduced-form expressions, and can be estimated under a general sequential unconfoundedness assumption. We also propose a doubly robust estimator for dynamic MPEs. Our approach does not require observing full dynamic state information (as is typically assumed for off-policy evaluation in Markov decision processes), and does not incur an exponential curse of horizon (as is typical in non-Markovian off-policy evaluation). We demonstrate practicality and robustness of our approach in a number of simulations, including one motivated by a dynamic pricing application where people use past prices to form a reference level for current prices.
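To make the estimand concrete, the following is a minimal toy sketch of what a dynamic marginal policy effect measures. It is not the paper's estimator: it simulates the system directly instead of working from logged data, and every name and number in it (`simulate_revenue`, `marginal_policy_effect`, the demand coefficients, the reference-price update weights) is a hypothetical choice made for illustration. Each price is shifted by a small amount `theta`, customers compare the price with a reference level built from past prices, and the MPE is approximated as the derivative of average total revenue with respect to `theta` at `theta = 0`.

```python
import numpy as np

def simulate_revenue(theta, n_units=2000, horizon=20, seed=0):
    """Average total revenue per unit when every price is shifted by theta.

    Toy model (all coefficients are illustrative assumptions):
    demand falls with the price and with the gap between the price
    and a reference level formed from past prices.
    """
    rng = np.random.default_rng(seed)  # fixed seed gives common random numbers
    ref = np.full(n_units, 10.0)       # initial reference price
    total = np.zeros(n_units)
    for _ in range(horizon):
        price = 10.0 + theta + rng.normal(0.0, 1.0, n_units)
        demand = (20.0 - 1.0 * price - 0.5 * (price - ref)
                  + rng.normal(0.0, 0.5, n_units))
        total += price * demand
        # Reference level is an exponentially weighted average of past prices,
        # so today's price change keeps affecting demand in later periods.
        ref = 0.7 * ref + 0.3 * price
    return total.mean()

def marginal_policy_effect(eps=1e-3, **kwargs):
    """Central finite difference of average revenue in theta at theta = 0."""
    up = simulate_revenue(eps, **kwargs)
    down = simulate_revenue(-eps, **kwargs)
    return (up - down) / (2.0 * eps)

mpe = marginal_policy_effect()
print(f"Approximate dynamic MPE in the toy model: {mpe:.2f}")
```

Because both simulations reuse the same seed, the random draws cancel in the difference and only the effect of the shift remains. The paper's setting is harder: the effect has to be recovered from observational data under sequential unconfoundedness, without access to a simulator or the full state.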
Problem

Research questions and friction points this paper is trying to address.

Dynamic Marginal Policy Effects
Sequential Unconfoundedness
Off-policy Evaluation
Dynamic Systems
Long-term Outcomes
Innovation

Methods, ideas, or system contributions that make the work stand out.

Dynamic Marginal Policy Effects
Sequential Unconfoundedness
Doubly Robust Estimation
Off-policy Evaluation
Curse of Horizon