On Tackling Complex Tasks with Reward Machines and Signal Temporal Logics

📅 2026-04-09

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the challenge of efficiently solving reinforcement learning tasks subject to complex temporal logic constraints by proposing a novel framework that integrates Signal Temporal Logic (STL) into reward machines. The approach leverages STL specifications to generate events and construct structured rewards, dynamically guiding the agent’s policy learning through an online STL monitoring algorithm to satisfy formal specifications. As the first study to combine STL with reward machines, it achieves compact reward representations and efficient training for intricate tasks. Empirical evaluations in Minigrid, Cart-Pole, and Highway environments demonstrate the method’s effectiveness and strong generalization capabilities on non-trivial tasks requiring precise temporal reasoning.
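To make the idea of online STL monitoring more concrete, here is a minimal sketch in Python. It is not the paper's algorithm: it assumes a discrete-time signal, a fixed sliding window, and the usual quantitative (robustness) reading of STL, in which "always" is a minimum and "eventually" is a maximum over the window. The class name `WindowMonitor`, the 0.2 rad bound, and the sample angles are hypothetical and chosen only for illustration.

```python
from collections import deque


class WindowMonitor:
    """Sliding-window robustness for 'always' or 'eventually' (illustrative only)."""

    def __init__(self, horizon, mode):
        if mode not in ("always", "eventually"):
            raise ValueError("mode must be 'always' or 'eventually'")
        # Keep only the most recent `horizon` margins.
        self.buf = deque(maxlen=horizon)
        # 'always' is the worst case over the window, 'eventually' the best case.
        self.agg = min if mode == "always" else max

    def update(self, margin):
        """Add one predicate margin and return the current window robustness."""
        self.buf.append(margin)
        return self.agg(self.buf)


# Hypothetical Cart-Pole style requirement: |theta| stays below 0.2 rad
# over the last 3 samples. The margin is positive when the bound holds.
monitor = WindowMonitor(horizon=3, mode="always")
for theta in (0.05, 0.10, 0.25, 0.02):
    rho = monitor.update(0.2 - abs(theta))
    satisfied = rho > 0  # a positive robustness means the window satisfies the bound
```

A positive return value indicates that the requirement holds over the current window, and its magnitude indicates by how much, which is what allows a monitor of this kind to feed a graded signal into a reward.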

📝 Abstract

We propose a Reinforcement Learning (RL) based control design framework for handling complex tasks. The approach extends the concept of Reward Machines (RM) with Signal Temporal Logic (STL) formulas that can be used for event generation. The use of STL not only yields a more efficient representation of rewards for complex tasks but also guides the training process to converge towards behaviors that satisfy the specified requirements. We also propose an implementation of the framework that leverages STL online monitoring algorithms. We illustrate the framework with three case studies (Minigrid, Cart-Pole, and Highway environments) involving non-trivial tasks.
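The combination of a reward machine with formula-based event generation can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the authors' implementation: the names `RewardMachine` and `events_from`, the observation keys `dist_key` and `dist_door`, the state labels, and the reward values are all hypothetical. Events are generated by thresholding a robustness-style margin at zero, which stands in for a full STL monitor.

```python
from dataclasses import dataclass, field


@dataclass
class RewardMachine:
    """Finite-state machine whose transitions are triggered by events (illustrative)."""
    initial: str
    transitions: dict  # maps (state, event) to (next_state, reward)
    state: str = field(init=False)

    def __post_init__(self):
        self.state = self.initial

    def step(self, events):
        """Take the first enabled transition (in sorted event order) and return its reward."""
        for event in sorted(events):
            key = (self.state, event)
            if key in self.transitions:
                self.state, reward = self.transitions[key]
                return reward
        return 0.0  # no enabled transition: stay in place with zero reward


def events_from(obs, predicates):
    """Emit the name of every predicate whose margin is positive on this observation."""
    return {name for name, rho in predicates.items() if rho(obs) > 0}


# Hypothetical grid task: get near the key first, then near the door.
predicates = {
    "near_key": lambda o: 1.0 - o["dist_key"],
    "near_door": lambda o: 1.0 - o["dist_door"],
}
rm = RewardMachine(
    initial="u0",
    transitions={
        ("u0", "near_key"): ("u1", 0.1),
        ("u1", "near_door"): ("done", 1.0),
    },
)
obs = {"dist_key": 0.5, "dist_door": 4.0}
reward = rm.step(events_from(obs, predicates))
```

In a training loop, `rm.step` would be called once per environment step, and the pair of environment observation and machine state would be given to the agent so that the reward stays Markovian with respect to what the agent sees.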

Problem

Research questions and friction points this paper is trying to address.

Reward Machines

Signal Temporal Logic

Reinforcement Learning

Complex Tasks

Temporal Specifications

Innovation

Methods, ideas, or system contributions that make the work stand out.

Reward Machines

Signal Temporal Logic

Reinforcement Learning