🤖 AI Summary
This work addresses the challenge of generating provably correct generalized policies—Python programs with formal correctness guarantees—as executable strategies within PDDL-defined world models, without requiring external verifiers. We propose LMPlan, a framework that uses prompt engineering to guide language models (LMs) to directly synthesize formally verifiable policy programs, tightly integrating PDDL domain modeling with built-in formal correctness guarantees. Our key contribution is the first demonstration of provably correct, programmatic policy generation that requires no external verification. Empirically, we find that LMs achieve superior performance when processing symbolic PDDL inputs—a result that challenges the conventional assumption that LM success relies primarily on semantic understanding and on memorization of training data. Under fixed computational resources, LMPlan significantly outperforms both classical PDDL planners and state-of-the-art LM-based approaches, scaling effectively to problems involving hundreds of objects.
📝 Abstract
We study the use of language models (LMs) for planning over world models specified in the Planning Domain Definition Language (PDDL). We prompt LMs to generate Python programs that serve as generalised policies for solving PDDL problems from a given domain. Notably, our approach synthesises policies that are provably sound relative to the PDDL domain without reliance on external verifiers. Experiments on competition benchmarks show that our policies solve more PDDL problems than PDDL planners and recent LM approaches under fixed time and memory constraints. Our approach manifests in the LMPlan planner, which can solve planning problems with several hundred relevant objects. Surprisingly, we observe that LMs used in our framework sometimes plan more effectively over PDDL problems written with meaningless symbols in place of natural language, e.g. rewriting (at dog kitchen) as (p2 o1 o3). This finding challenges hypotheses that LMs reason over word semantics and memorise solutions from their training corpora, and is worth further exploration.
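The symbol-rewriting experiment described above can be sketched as a simple token substitution over PDDL atoms. This is a minimal illustration assuming a precomputed mapping from natural-language symbols to opaque names (the `p*`/`o*` naming scheme follows the example in the abstract; the actual rewriting used in the paper may differ):

```python
def obfuscate(pddl_atom: str, mapping: dict) -> str:
    """Replace natural-language PDDL symbols with meaningless tokens.

    `mapping` assigns each predicate and object an opaque name,
    e.g. predicates -> p1, p2, ... and objects -> o1, o2, ...
    Unmapped tokens pass through unchanged.
    """
    tokens = pddl_atom.strip("()").split()
    return "(" + " ".join(mapping.get(t, t) for t in tokens) + ")"

# Assumed mapping reproducing the abstract's example:
mapping = {"at": "p2", "dog": "o1", "kitchen": "o3"}
print(obfuscate("(at dog kitchen)", mapping))  # -> (p2 o1 o3)
```

Applying the same mapping consistently across a domain and its problems preserves the planning task's structure exactly while removing all lexical cues, which is what isolates structural reasoning from semantic priming or memorisation.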