Composable Model-Free RL for Navigation with Input-Affine Systems

📅 2026-02-13
📈 Citations: 0
Influential: 0

📝 Abstract
As autonomous robots move into complex, dynamic real-world environments, they must learn to navigate safely in real time, yet anticipating all possible behaviors in advance is infeasible. We propose a composable, model-free reinforcement learning method that learns a value function and an optimal policy for each individual environment element (e.g., a goal or an obstacle) and composes them online to achieve goal reaching and collision avoidance. Assuming unknown nonlinear dynamics that evolve in continuous time and are input-affine, we derive a continuous-time Hamilton-Jacobi-Bellman (HJB) equation for the value function and show that the corresponding advantage function is quadratic in the action, from which we derive the optimal policy. Based on this structure, we introduce a model-free actor-critic algorithm that learns policies and value functions for static or moving obstacles using gradient descent. We then compose multiple reach/avoid models via a quadratically constrained quadratic program (QCQP), yielding formal obstacle-avoidance guarantees in terms of value-function level sets and providing a model-free alternative to CLF/CBF-based controllers. Simulations demonstrate improved performance over a PPO baseline applied to a discrete-time approximation.
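The online composition step described in the abstract can be sketched numerically. Because each element's advantage is quadratic in the action, composing a goal model with an avoid model reduces to a small QCQP: track the goal-optimal action while keeping the avoid model's value above a safety level. The sketch below is a toy illustration under assumed names and numbers (the actions `u_goal`, `u_avoid`, the margin `c`, and the use of SciPy's SLSQP solver are all assumptions, not the paper's implementation):

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical composed control step: minimize the (quadratic) goal advantage
# loss while staying outside a quadratic unsafe action set induced by the
# obstacle model's value-function level set. All values are illustrative.
u_goal = np.array([1.0, 0.0])   # assumed goal-optimal action
u_avoid = np.array([0.8, 0.0])  # assumed center of the unsafe action set
c = 0.25                        # assumed squared safety margin

objective = lambda u: np.sum((u - u_goal) ** 2)  # goal advantage loss

# Quadratic safety constraint: (u - u_avoid)^T (u - u_avoid) - c >= 0
safety = {"type": "ineq",
          "fun": lambda u: np.sum((u - u_avoid) ** 2) - c}

res = minimize(objective, x0=np.array([1.5, 0.1]),
               method="SLSQP", constraints=[safety])
u_safe = res.x  # safe action closest to the goal-optimal one
```

Here the unconstrained optimum `u_goal` violates the safety constraint, so the solver returns the nearest action on the boundary of the safe set; in the paper this trade-off is solved online at every state from the learned per-element value functions.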
Problem

Research questions and friction points this paper is trying to address.

navigation
obstacle avoidance
input-affine systems
model-free reinforcement learning
real-time safety
Innovation

Methods, ideas, or system contributions that make the work stand out.

composable reinforcement learning
input-affine systems
Hamilton-Jacobi-Bellman equation
quadratically constrained quadratic program
model-free navigation
Xinhuan Sang
Boston University, Boston MA 02215, USA
Abdelrahman Abdelgawad
Boston University, Boston MA 02215, USA
Roberto Tron
Associate Professor - Boston University
Automatic Control · Robotics · Computer Vision · Riemannian Geometry · Optimization