🤖 AI Summary
This work addresses the challenge of auditing near-optimal policies in reinforcement learning, which can yield similar returns yet behave very differently. To characterize the complexity of the set of such policies, the authors introduce an "occupancy Rashomon capacity": the metric entropy of the near-optimal occupancy region, measured relative to an audited deployment class. They construct a hidden-branch MDP instance and propose an occupancy regularizer whose regularized audited capacity collapses when a trusted reference occupancy is available. For exact local queries and for noisy sample queries, they establish lower bounds of Ω(M/b) and Ω(M/β) on the number of required queries, respectively, where b is the sparsity of the local signatures and β is the per-sample KL signal. When the packing realizes the deployment-class capacity and b = O(1), the exact-query bound becomes Ω(2^{H_opt^F(ε)}), which is exponential in the capacity. Controlled benchmarks separate sparse-signature instances from high-capacity negative controls where exact auditing is easy.
📝 Abstract
When many reinforcement-learning policies achieve near-optimal return, a post-hoc auditor may have to distinguish among many behaviorally distinct but return-equivalent policies. We formalize this phenomenon through an occupancy-measure analogue of Rashomon capacity: the metric entropy of the near-optimal occupancy region, computed relative to an audited deployment class. Because occupancy measures identify behavior only up to occupancy equivalence, we formulate auditing at the occupancy-class level and distinguish exact local-query oracles from noisy sample-query oracles. Our main exact-query result is conditional: if the audited class contains a $2/H$-separated near-optimal packing whose local signatures are $b$-sparse, then exact local-query auditing requires $\Omega(M/b)$ queries; when the packing realizes deployment-class capacity and $b=O(1)$, this becomes $\Omega(2^{H_{\mathrm{opt}}^{\mathcal{F}}(\varepsilon)})$. We give a finite discounted hidden-branch MDP attaining this bound and show the exact Bayes success law. For noisy hidden-trigger testing, we prove a mixture lower bound of order $M/\beta$, where $\beta$ is the per-sample KL signal, yielding $\Omega(2^{H_{\mathrm{opt}}^{\mathcal{F}}(\varepsilon)}/(\rho^2\Delta^2))$ for capacity-order packings with $\beta=O(\rho^2\Delta^2)$. We also provide a static target-recognition information lower bound, a transcript-compatible oracle-cover verification upper bound, and a canonical occupancy regularizer whose regularized audited capacity collapses when a trusted reference occupancy is available. Controlled benchmarks distinguish positive sparse-signature instances from high-capacity negative controls where exact auditing is easy, and map the noisy-trigger law to post-processed continuous-control and visual-RL auditing regimes.
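To make the $M/b$ scaling concrete, the following is a minimal toy sketch, not the paper's hidden-branch MDP or its lower-bound argument. It assumes a uniformly random hidden target among `M` candidates and a hypothetical oracle in which each exact query inspects a batch of `b` candidates and reveals whether the target is among them. The function names and the sweep strategy are illustrative assumptions; the sketch only shows that one simple auditor needs on the order of `M / b` queries, which is consistent with (but does not prove) the stated bound.

```python
import random


def queries_to_find(b: int, target: int) -> int:
    """Queries used by a fixed sweep over batches of size b (toy assumption).

    Batch k covers candidates [k*b, (k+1)*b); the sweep stops at the first
    batch that contains the hidden target.
    """
    return target // b + 1


def exact_mean_queries(M: int, b: int) -> float:
    """Average query count over a uniformly random hidden target."""
    return sum(queries_to_find(b, t) for t in range(M)) / M


def simulated_mean_queries(M: int, b: int, trials: int = 10_000, seed: int = 0) -> float:
    """Monte Carlo estimate of the same average, for comparison."""
    rng = random.Random(seed)
    total = sum(queries_to_find(b, rng.randrange(M)) for _ in range(trials))
    return total / trials


if __name__ == "__main__":
    capacity_bits = 10          # stands in for the capacity exponent (assumption)
    M = 2 ** capacity_bits      # number of candidates, M = 2^capacity
    for b in (1, 4, 16):
        print(b, exact_mean_queries(M, b), simulated_mean_queries(M, b))
```

In this toy setting the average cost is $(M/b + 1)/2$ whenever $b$ divides $M$, so doubling the capacity exponent squares $M$ and, for fixed $b$, squares the query count up to constants.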