🤖 AI Summary
This work addresses the challenge of enforcing Signal Temporal Logic (STL) specifications in reinforcement learning for complex spatio-temporal tasks, such as visiting moving targets or executing periodic behaviors, where existing approaches are largely restricted to static safety constraints. The paper proposes a framework that integrates sequential control barrier functions with model-free reinforcement learning, enabling satisfaction of rich STL specifications, including visits to dynamically moving targets with unknown trajectories, throughout the entire learning process, thereby moving beyond the conventional paradigm of static safety guarantees. Extensive simulations demonstrate the method's effectiveness and robustness on intricate dynamic STL tasks.
📝 Abstract
Reinforcement Learning (RL) has shown promise in various robotics applications, yet its deployment on real systems is still limited by safety and operational constraints. Safe RL, which focuses on imposing safety constraints throughout the learning process, has gained considerable attention in recent years. However, real systems often require constraints more complex than safety alone, such as periodic recharging or time-bounded visits to specific regions, and imposing such spatio-temporal tasks during learning remains a challenge. Signal Temporal Logic (STL) is a formal language for specifying temporal properties of real-valued signals and provides a way to express such complex tasks. In this paper, we propose a framework that leverages sequential control barrier functions and model-free RL to ensure that the given STL tasks are satisfied throughout the learning process. Our method extends beyond traditional safety constraints by enforcing rich STL specifications, which can involve visits to dynamic targets with unknown trajectories. We demonstrate the effectiveness of our framework through various simulations.
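As background for readers unfamiliar with STL (this is standard STL robustness semantics, not the paper's specific algorithm): satisfaction of an STL formula over a trajectory is typically quantified by a real-valued robustness score, positive when the formula holds. A minimal sketch for a task of the form "eventually, within the horizon, be within radius `r` of a (possibly moving) target", where the function name and array layout are illustrative assumptions:

```python
import numpy as np

def eventually_reach_robustness(agent_traj, target_traj, r):
    """Robustness of the STL formula  F_[0,T] ( ||x_t - p_t|| <= r ).

    agent_traj, target_traj: arrays of shape (T+1, d) holding the agent's
    and target's positions at each time step; r: reach radius.
    A positive return value means the specification is satisfied.
    """
    dists = np.linalg.norm(agent_traj - target_traj, axis=1)
    # Predicate robustness at each step is (r - dist); "eventually" takes
    # the max over time, per the standard quantitative semantics of STL.
    return float(np.max(r - dists))
```

Such a score can serve as a monitor: a learning framework like the one proposed here aims to keep it positive throughout training, rather than only rewarding it after the fact.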