LLM-Guided Probabilistic Program Induction for POMDP Model Estimation

📅 2025-05-04
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This work addresses the automatic modeling of partially observable Markov decision processes (POMDPs) under structural uncertainty. It proposes a learning framework that induces low-complexity probabilistic graphical model structures, using large language models (LLMs) as structural priors to guide search over the space of probabilistic programs. The method combines empirical distribution validation, Bayesian model checking, and iterative feedback optimization to yield interpretable, compact, and executable POMDP specifications, including transition, observation, reward, and initial-state distributions, and it unifies LLM prompting, probabilistic program synthesis, and POMDP simulation. The approach is evaluated on classical toy POMDPs, simulated MiniGrid domains, and real-world mobile-robot search tasks. Experiments show substantial improvements over tabular POMDP learning, behavioral cloning, and direct LLM-based planning in modeling accuracy, generalization across environments, and deployability on resource-constrained robotic platforms.

📝 Abstract
Partially Observable Markov Decision Processes (POMDPs) model decision making under uncertainty. While there are many approaches to approximately solving POMDPs, we aim to address the problem of learning such models. In particular, we are interested in a subclass of POMDPs wherein the components of the model, including the observation function, reward function, transition function, and initial state distribution function, can be modeled as low-complexity probabilistic graphical models in the form of a short probabilistic program. Our strategy for learning these programs uses an LLM as a prior, generating candidate probabilistic programs that are then tested against the empirical distribution and adjusted through feedback. We experiment on a number of classical toy POMDP problems, simulated MiniGrid domains, and two real mobile-base robotics search domains involving partial observability. Our results show that using an LLM to guide the construction of a low-complexity POMDP model can be more effective than tabular POMDP learning, behavior cloning, or direct LLM planning.
Problem

Research questions and friction points this paper is trying to address.

Learning POMDP models with low-complexity probabilistic programs
Using LLMs as priors for generating candidate probabilistic programs
Improving POMDP model estimation over tabular learning and behavior cloning
Innovation

Methods, ideas, or system contributions that make the work stand out.

LLM-guided probabilistic program induction
Low-complexity probabilistic graphical models
Feedback-adjusted candidate program testing
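The generate-test-feedback loop named in these bullets can be illustrated with a toy sketch. This is not the paper's implementation: the hard-coded `CANDIDATES` dictionary stands in for LLM-proposed probabilistic programs, and `empirical_distance`, `induce`, and the tolerance value are illustrative names and choices. In the full method, the score for a rejected candidate would be fed back to the LLM as a repair hint rather than simply driving a fixed re-scoring loop.

```python
import random
from collections import Counter

# Stand-in for the LLM prior: each "program" is a short callable that
# samples an observation. A real system would prompt an LLM for
# executable probabilistic-program text instead of using this table.
CANDIDATES = {
    "fair_coin":   lambda: random.choice([0, 1]),
    "biased_coin": lambda: 1 if random.random() < 0.8 else 0,
    "always_one":  lambda: 1,
}

def empirical_distance(program, data, n=2000):
    """Total-variation distance between samples drawn from the
    candidate program and the empirical observation distribution."""
    sim = Counter(program() for _ in range(n))
    emp = Counter(data)
    support = set(sim) | set(emp)
    return 0.5 * sum(abs(sim[x] / n - emp[x] / len(data)) for x in support)

def induce(data, rounds=3, tol=0.05):
    """Generate-and-test loop: keep the candidate whose samples best
    match the data; stop early once the fit is within tolerance."""
    best_name, best_score = None, float("inf")
    for _ in range(rounds):
        for name, prog in CANDIDATES.items():
            score = empirical_distance(prog, data)
            if score < best_score:
                best_name, best_score = name, score
        if best_score < tol:
            break
    return best_name, best_score

random.seed(0)
# Simulated environment: observations are 1 with probability 0.8.
observations = [1 if random.random() < 0.8 else 0 for _ in range(1000)]
name, score = induce(observations)
print(name)  # → biased_coin
```

The total-variation score doubles as the feedback signal: a candidate that fits poorly (e.g. `always_one`, which misses every 0 observation) gets a large distance, and that mismatch is exactly the kind of diagnostic one would hand back to the LLM when asking it to revise the program.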
Aidan Curtis
Research Scientist, Boston Dynamics
Machine Learning, Robotics
Hao Tang
Cornell University
Thiago Veloso
MIT CSAIL
Kevin Ellis
Cornell University
Tomás Lozano-Pérez
MIT CSAIL
L. Kaelbling
MIT CSAIL