🤖 AI Summary
This paper addresses the Constrained Stochastic Shortest Path (CSSP) problem: minimizing a primary cost (e.g., time) under probabilistic action effects while satisfying constraints on secondary costs (e.g., a monetary budget). Existing heuristic search algorithms solve a sequence of increasingly larger CSSPs as linear programs. The paper instead proposes CARL, an algorithm that solves a series of unconstrained Stochastic Shortest Path (SSP) problems with efficient heuristic search. Each SSP is built from a scalarization that projects the CSSP's vector of primary and secondary costs onto a single scalar cost. CARL searches for a maximizing scalarization with an optimization procedure similar to the subgradient method which, together with the solution to the associated SSP, yields a set of policies that are combined into an optimal policy for the CSSP. On existing benchmarks, CARL solves 50% more problems than the state of the art.
📝 Abstract
Constrained Stochastic Shortest Path Problems (CSSPs) model problems with probabilistic effects, where a primary cost is minimised subject to constraints over secondary costs, e.g., minimise time subject to monetary budget. Current heuristic search algorithms for CSSPs solve a sequence of increasingly larger CSSPs as linear programs until an optimal solution for the original CSSP is found. In this paper, we introduce a novel algorithm CARL, which solves a series of unconstrained Stochastic Shortest Path Problems (SSPs) with efficient heuristic search algorithms. These SSP subproblems are constructed with scalarisations that project the CSSP's vector of primary and secondary costs onto a scalar cost. CARL finds a maximising scalarisation using an optimisation algorithm similar to the subgradient method which, together with the solution to its associated SSP, yields a set of policies that are combined into an optimal policy for the CSSP. Our experiments show that CARL solves 50% more problems than the state-of-the-art on existing benchmarks.
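The interplay between scalarisation and a subgradient-type update can be illustrated on a toy problem. The sketch below is not CARL itself: it is a generic Lagrangian-dual subgradient loop, and every name in it (`ACTIONS`, `BUDGET`, `solve_scalarised_ssp`, `mix_policies`) as well as the one-state problem and the 1/(k+1) step size are assumptions made for illustration. The SSP "solver" is a closed-form stand-in for a heuristic search algorithm.

```python
# Toy one-state CSSP (hypothetical). Each action maps to
# (probability of reaching the goal per step, (primary cost, secondary cost) per step);
# with the remaining probability the agent stays in the start state.
ACTIONS = {
    "fast": (0.5, (1.0, 2.0)),
    "slow": (1.0, (3.0, 1.0)),
}
BUDGET = 2.5  # upper bound on the expected total secondary cost


def expected_totals(action):
    """Expected (primary, secondary) cost of repeating `action` until the goal."""
    p, (c0, c1) = ACTIONS[action]
    return c0 / p, c1 / p  # number of attempts is geometric with mean 1/p


def solve_scalarised_ssp(lam):
    """Stand-in for an SSP solver: best action under scalar cost c0 + lam * c1."""
    def scalar_cost(a):
        t0, t1 = expected_totals(a)
        return t0 + lam * t1
    return min(ACTIONS, key=scalar_cost)


def subgradient_ascent(iterations=200):
    """Search for a multiplier lam >= 0 that maximises the Lagrangian dual."""
    lam, best_lam, best_dual, seen = 0.0, 0.0, float("-inf"), set()
    for k in range(iterations):
        action = solve_scalarised_ssp(lam)
        seen.add(action)
        t0, t1 = expected_totals(action)
        dual = t0 + lam * (t1 - BUDGET)  # dual value at the current lam
        if dual > best_dual:
            best_lam, best_dual = lam, dual
        # The constraint violation is a subgradient of the dual at lam;
        # step along it with a diminishing step size and project onto lam >= 0.
        lam = max(0.0, lam + (t1 - BUDGET) / (k + 1))
    return best_lam, best_dual, seen


def mix_policies(a, b):
    """Weight on policy `a` so that the mixture spends exactly the budget."""
    (a0, a1), (b0, b1) = expected_totals(a), expected_totals(b)
    w = (BUDGET - b1) / (a1 - b1)
    return w, w * a0 + (1 - w) * b0


best_lam, best_dual, seen = subgradient_ascent()
weight, mixed_time = mix_policies("fast", "slow")
print(best_lam, best_dual, sorted(seen), weight, mixed_time)
```

In this toy instance neither deterministic policy is optimal on its own: "fast" exceeds the budget and "slow" wastes time. The multiplier oscillates around the value at which both policies have equal scalarised cost, and mixing the two policies encountered along the way meets the budget exactly, which mirrors the abstract's point that a set of policies is combined into the final solution.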