🤖 AI Summary
This work addresses the challenge of cross-domain generalization for robotic imitation learning under sparse and imperfect demonstrations. Methodologically, it formulates demonstration interpretation as a Bayesian program induction problem: a vision-language model generates symbolic task hypotheses, which are refined via a hierarchical generative model and a planner in a closed-loop inference process that jointly infers high-level goals, subtask structure, and execution constraints, yielding a posterior distribution over executable programs. The key contribution is a method that automatically recovers the latent program logic from a single noisy demonstration, without fine-tuning or auxiliary examples. Experiments show that, given only one demonstration, the approach accurately reconstructs task structure in novel scenes with substantial variation in object pose, count, geometry, and spatial layout, generalizing significantly better than state-of-the-art baselines.
📝 Abstract
Humans can observe a single, imperfect demonstration and immediately generalize to very different problem settings. Robots, in contrast, often require hundreds of examples and still struggle to generalize beyond the training conditions. We argue that this limitation arises from an inability to recover the latent explanations that underpin intelligent behavior, and that these explanations can take the form of structured programs consisting of high-level goals, sub-task decomposition, and execution constraints. In this work, we introduce Rational Inverse Reasoning (RIR), a framework for inferring these latent programs through a hierarchical generative model of behavior. RIR frames few-shot imitation as Bayesian program induction: a vision-language model iteratively proposes structured symbolic task hypotheses, while a planner-in-the-loop inference scheme scores each hypothesis by the likelihood of the observed demonstration under it. This loop yields a posterior over concise, executable programs. We evaluate RIR on a suite of continuous manipulation tasks designed to test one-shot and few-shot generalization across variations in object pose, count, geometry, and layout. With as few as one demonstration, RIR infers the intended task structure and generalizes to novel settings, outperforming state-of-the-art vision-language model baselines.
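The planner-in-the-loop scoring step described above can be sketched as a standard Bayesian reweighting: each proposed program hypothesis is scored by the demonstration's log-likelihood under that hypothesis plus a log-prior, and the scores are normalized into a posterior. The sketch below is illustrative only; all function names (`posterior_over_programs`, `toy_loglik`, `uniform_prior`) and the toy hypotheses are assumptions for exposition, not the paper's actual implementation.

```python
import math

def posterior_over_programs(hypotheses, demo, loglik_fn, prior_fn):
    """Bayes rule over candidate programs: score each hypothesis by
    log prior(h) + log p(demo | h), then normalize with log-sum-exp
    for numerical stability. Returns a list of posterior weights."""
    log_scores = [math.log(prior_fn(h)) + loglik_fn(h, demo) for h in hypotheses]
    m = max(log_scores)                      # subtract max before exponentiating
    weights = [math.exp(s - m) for s in log_scores]
    z = sum(weights)
    return [w / z for w in weights]

# Toy example: two candidate symbolic programs, one consistent with the demo.
hypotheses = ["stack_red_on_blue", "stack_blue_on_red"]
demo = "red block ends up on blue block"

def toy_loglik(h, d):
    # Stand-in for the planner: execute h, compare to the demo trajectory.
    return 0.0 if h == "stack_red_on_blue" else -2.0

def uniform_prior(h):
    return 1.0 / len(hypotheses)

posterior = posterior_over_programs(hypotheses, demo, toy_loglik, uniform_prior)
```

In the paper's framework the log-likelihood term would come from a planner executing the hypothesized program, not a hand-coded comparison as in this toy stand-in.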