Neuro-Symbolic Learning for Long-Horizon Task Planning Under Complex Logical Constraints

📅 2026-06-04

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses exposure bias and efficiency bottlenecks in long-horizon robotic task planning under complex logical constraints. The authors propose a bilevel optimization framework grounded in imperative learning: an upper-level neural scorer learns to assess object importance, while a lower-level symbolic planner solves the planning problem within the search space pruned by those scores. To stabilize training and provide reliable feedback, they introduce a 3R recovery mechanism that runs Repair, Restart, and Rollback strategies in parallel. Evaluated on three benchmarks, the method achieves state-of-the-art performance, including an 80.04% reduction in failure rate and a 57.14% reduction in planning time. It is further validated in simulation and real-world experiments on a quadruped-based mobile manipulator.
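
The bilevel structure described above can be written in a generic form. This is only a sketch of how such problems are commonly stated, not the paper's own formulation: the symbols $\theta$ (scorer parameters), $f_\theta$ (neural scorer), $\tau$ (pruning threshold), $\mathcal{O}$ (all objects), $\Pi(\cdot)$ (plans that use only the given objects), $C$ (lower-level planning cost), and $U$ (upper-level loss) are all assumptions introduced for illustration.

```latex
\begin{aligned}
\min_{\theta}\quad & U\bigl(\theta,\ \pi^{*}(\theta)\bigr) \\
\text{s.t.}\quad & \pi^{*}(\theta) \in \arg\min_{\pi \,\in\, \Pi(\mathcal{O}_{\theta})} C(\pi), \\
& \mathcal{O}_{\theta} = \{\, o \in \mathcal{O} \;:\; f_{\theta}(o) \ge \tau \,\}.
\end{aligned}
```

Because the lower-level feasible set $\Pi(\mathcal{O}_{\theta})$ depends on the scorer's own predictions, the upper level is trained on the same pruned spaces the planner sees at deployment, which is the train-test mismatch the summary refers to.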

📝 Abstract

Task planning often suffers from severe efficiency bottlenecks when robots must reason over long-horizon action sequences under complex logical constraints, including object affordances, spatial relationships, and sequential action dependencies. Recent neuro-symbolic methods improve planning efficiency by learning object-importance scores to prune task-irrelevant objects, but they typically rely on fixed offline supervision generated from full search spaces. This creates a train-test mismatch: at deployment, the planner operates in pruned search spaces induced by the model's own imperfect predictions, leading to exposure bias and degraded planning performance. To address this challenge, we formulate object-importance learning for task planning as an imperative learning-based bilevel optimization problem. The upper level optimizes a neural scorer, while the lower level solves a symbolic planning problem in the score-pruned search space. To stabilize this learning process, we introduce a 3R strategy into the lower-level planning, using parallel Repair, Restart, and Rollback recovery to provide reliable and adaptive feedback for upper-level learning. Experiments on three challenging benchmarks demonstrate state-of-the-art performance, including an 80.04% reduction in failure rate and a 57.14% reduction in planning time. We further validate the framework on a quadruped-based mobile manipulator in simulation and the real world, demonstrating its potential for efficient and deployable neuro-symbolic task planning.
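
To make the idea of planning in a score-pruned search space concrete, here is a minimal Python sketch. It is not the authors' implementation: the toy domain, the function names (`prune_objects`, `plan`, `plan_with_relaxation`), the hard-coded scores standing in for a neural scorer, and the threshold schedule are all hypothetical. The fallback loop simply replans with a looser threshold when pruning removes a needed object; it is a much simpler stand-in for the paper's 3R recovery, whose details are not given here.

```python
from collections import deque


def prune_objects(scores, threshold):
    """Keep objects whose importance score is at least `threshold`."""
    return {obj for obj, s in scores.items() if s >= threshold}


def plan(actions, init, goal, objects):
    """Breadth-first search using only actions whose objects were all kept.

    Each action is (name, objects_used, preconditions, add_effects).
    Returns a list of action names, or None if the goal is unreachable.
    """
    usable = [a for a in actions if a[1] <= objects]
    start = frozenset(init)
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, path = queue.popleft()
        if goal <= state:
            return path
        for name, _, pre, add in usable:
            if pre <= state:
                nxt = state | add
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, path + [name]))
    return None


def plan_with_relaxation(actions, init, goal, scores, thresholds):
    """Try each threshold in order; return (plan, threshold) for the first success."""
    for t in thresholds:
        kept = prune_objects(scores, t)
        result = plan(actions, init, goal, kept)
        if result is not None:
            return result, t
    return None, None


# Hypothetical toy domain: open a door that needs a key; the vase is irrelevant.
ACTIONS = [
    ("pick_key", frozenset({"key"}), frozenset(), frozenset({"has_key"})),
    ("open_door", frozenset({"key", "door"}), frozenset({"has_key"}),
     frozenset({"door_open"})),
    ("pick_vase", frozenset({"vase"}), frozenset(), frozenset({"has_vase"})),
]
# Stand-in for scorer output; the key is deliberately under-scored.
SCORES = {"key": 0.4, "door": 0.9, "vase": 0.1}
GOAL = frozenset({"door_open"})

found_plan, used_threshold = plan_with_relaxation(
    ACTIONS, frozenset(), GOAL, SCORES, thresholds=(0.5, 0.3, 0.0)
)
print(found_plan, used_threshold)
```

In this toy case the first threshold prunes the key, so planning fails. That is the kind of failure a scorer trained only on full search spaces never sees, and it is the failure signal a bilevel setup can feed back to the upper level.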

Problem

Research questions and friction points this paper is trying to address.

long-horizon task planning

complex logical constraints

neuro-symbolic learning

train-test mismatch

exposure bias

Innovation

Methods, ideas, or system contributions that make the work stand out.

Neuro-Symbolic Learning

Bilevel Optimization

Imperative Learning