Infra-Bayesian Reinforcement Learning Agents Outperform Classical RL For Worst-Case Robustness

📅 2026-05-21

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Classical reinforcement learning can suffer unbounded regret in non-realizable environments, where model misspecification and Knightian uncertainty mean the agent's model class does not contain the true environment. This work presents the first proof-of-concept implementation of an infra-Bayesian reinforcement learning architecture for stateless, finite-outcome decision problems. The agent maintains a set of imprecise hypotheses, updates them by infra-Bayesian conditioning, and selects actions that maximize worst-case expected utility, explicitly separating probabilistic uncertainty from Knightian uncertainty. Empirically, the agent achieves lower worst-case regret than classical reinforcement learning agents in an environment with Knightian uncertainty, and it picks the optimal strategy in Newcomb's problem.
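To make the maximin rule concrete, here is a minimal sketch of worst-case action selection over sets of candidate outcome distributions, compared with a baseline that averages over the candidates. It is an illustration under stated assumptions, not the paper's implementation: the action names, the numbers, and every function name are hypothetical, and the full infra-Bayesian update step is not modeled.

```python
from typing import Dict, List, Sequence

# Hypothetical toy setup: each action maps to a set of candidate outcome
# distributions that the agent has no grounds to weight against each other.
Distribution = Sequence[float]
HypothesisSet = List[Distribution]


def expected_value(dist: Distribution, utilities: Sequence[float]) -> float:
    """Expected utility of one outcome distribution."""
    return sum(p * u for p, u in zip(dist, utilities))


def worst_case_value(hyps: HypothesisSet, utilities: Sequence[float]) -> float:
    """Lowest expected utility over all candidate distributions."""
    return min(expected_value(d, utilities) for d in hyps)


def maximin_action(sets: Dict[str, HypothesisSet], utilities: Sequence[float]) -> str:
    """Pick the action whose worst-case expected utility is highest."""
    return max(sets, key=lambda a: worst_case_value(sets[a], utilities))


def averaged_action(sets: Dict[str, HypothesisSet], utilities: Sequence[float]) -> str:
    """Baseline (assumption): weight the candidates uniformly and average."""
    def mean_value(action: str) -> float:
        values = [expected_value(d, utilities) for d in sets[action]]
        return sum(values) / len(values)
    return max(sets, key=mean_value)


# Two outcomes: index 0 = failure (utility 0), index 1 = success (utility 1).
utilities = [0.0, 1.0]
hypothesis_sets = {
    "safe": [[0.5, 0.5]],               # one well-understood distribution
    "risky": [[0.9, 0.1], [0.0, 1.0]],  # two candidates, no basis to rank them
}

print(maximin_action(hypothesis_sets, utilities))
print(averaged_action(hypothesis_sets, utilities))
```

In this toy case the averaging baseline prefers the risky action because one of its candidates looks very good, while the maximin rule prefers the safe action because the risky action's worst candidate is poor.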

📝 Abstract

Classical reinforcement learning assumes the agent interacts with a fixed environment whose behavior does not depend on the agent's policy. This assumption breaks down in non-realizable settings where other actors might anticipate the agent's behavior, including environments crucial to AI safety, where the agent interacts with predictors, humans, other AI agents, and institutions. In such settings, the agent's model class fails to capture the world in which it operates. Under such misspecification, classical Bayesian methods can produce confidently wrong posteriors, unreliable decisions, and unbounded regret, because the realizability assumption fails. Infra-Bayesianism is a decision-theoretic framework that addresses these failures by distinguishing ordinary probabilistic uncertainty, where priors can be reasonably chosen, from Knightian uncertainty, where there are no grounds for constructing such a prior. It does so by evaluating actions on their worst-case outcomes rather than on posterior expectations or weighted averages. We present the first proof-of-concept implementation of an infra-Bayesian reinforcement learning architecture for finite-outcome stateless decision problems. Our agent maintains a set of imprecise hypotheses, updates them using infra-Bayesian conditioning, and selects actions by maximizing worst-case expected value. We apply this implementation of the infra-Bayesian maximin decision process to an environment with Knightian uncertainty and demonstrate lower worst-case regret than classical reinforcement learning agents. We also investigate Newcomb's problem and show that the infra-Bayesian agent picks the optimal strategy, outperforming classical decision theory agents. Our results provide a step towards reinforcement learning agents that remain robust under model misspecification and policy-dependent uncertainty.
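The Newcomb's problem result can be illustrated with a small sketch of why policy-dependent hypotheses change the preferred action. Everything here is an assumption made for illustration: the standard payoffs of 1,000,000 and 1,000, the predictor accuracy range, the causal-style baseline, and all function names are hypothetical and are not taken from the paper's experiments.

```python
MILLION = 1_000_000
THOUSAND = 1_000
ACTIONS = ("one-box", "two-box")


def payoff(action: str, predicted: str) -> int:
    """Opaque box is filled only if one-boxing was predicted."""
    opaque = MILLION if predicted == "one-box" else 0
    transparent = THOUSAND if action == "two-box" else 0
    return opaque + transparent


def fixed_environment_value(action: str, p_full: float) -> float:
    """Baseline (assumption): box contents are fixed regardless of the action."""
    return (p_full * payoff(action, "one-box")
            + (1 - p_full) * payoff(action, "two-box"))


def policy_dependent_worst_case(action: str, accuracies) -> float:
    """Worst case over predictor accuracies, with the prediction tied to the action."""
    other = "two-box" if action == "one-box" else "one-box"
    return min(
        acc * payoff(action, action) + (1 - acc) * payoff(action, other)
        for acc in accuracies
    )


# Hypothetical accuracy range: the predictor is right at least 90% of the time.
accuracies = [0.9, 0.95, 1.0]

baseline_choice = max(ACTIONS, key=lambda a: fixed_environment_value(a, 0.5))
maximin_choice = max(ACTIONS, key=lambda a: policy_dependent_worst_case(a, accuracies))

print(baseline_choice)
print(maximin_choice)
```

Under the fixed-environment assumption, two-boxing is worth exactly 1,000 more for any belief about the box, so the baseline two-boxes. Once the hypotheses depend on the chosen action, the worst case for one-boxing is far higher than the worst case for two-boxing, so the maximin rule one-boxes.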

Problem

Research questions and friction points this paper is trying to address.

non-realizable settings

model misspecification

Knightian uncertainty

worst-case robustness

policy-dependent environment

Innovation

Methods, ideas, or system contributions that make the work stand out.

Infra-Bayesianism

Knightian uncertainty

model misspecification