Multi-layer Abstraction for Nested Generation of Options (MANGO) in Hierarchical Reinforcement Learning

📅 2025-08-25
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
To address the challenges of task decomposition, low policy learning efficiency, and poor generalization in hierarchical reinforcement learning under long-horizon sparse-reward settings, this paper proposes a cross-layer nested options mechanism coupled with an inner-layer policy guidance strategy. Specifically, it constructs reusable and nestable modular options within an abstract state space to enable hierarchical modeling of high-level actions and transfer of motion patterns across tasks. Task-action integration and customized reward shaping are further employed to enhance decision interpretability and safety guarantees. Experimental results on procedurally generated grid-world environments demonstrate significant improvements: +37% in sample efficiency and +29% in cross-task generalization performance. These findings validate the method’s effectiveness, scalability, and suitability for complex, safety-critical, and industrial applications.

Technology Category

Application Category

📝 Abstract
This paper introduces MANGO (Multilayer Abstraction for Nested Generation of Options), a novel hierarchical reinforcement learning framework designed to address the challenges of long-term sparse reward environments. MANGO decomposes complex tasks into multiple layers of abstraction, where each layer defines an abstract state space and employs options to modularize trajectories into macro-actions. These options are nested across layers, allowing for efficient reuse of learned movements and improved sample efficiency. The framework introduces intra-layer policies that guide the agent's transitions within the abstract state space, and task actions that integrate task-specific components such as reward functions. Experiments conducted in procedurally-generated grid environments demonstrate substantial improvements in both sample efficiency and generalization capabilities compared to standard RL methods. MANGO also enhances interpretability by making the agent's decision-making process transparent across layers, which is particularly valuable in safety-critical and industrial applications. Future work will explore automated discovery of abstractions and abstract actions, adaptation to continuous or fuzzy environments, and more robust multi-layer training strategies.
Problem

Research questions and friction points this paper is trying to address.

Addresses long-term sparse reward challenges in reinforcement learning
Decomposes complex tasks into multiple abstraction layers
Improves sample efficiency and generalization in hierarchical RL
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hierarchical reinforcement learning with multilayer abstraction
Nested options for efficient movement reuse
Intra-layer policies for abstract state transitions
🔎 Similar Papers
No similar papers found.
A
Alessio Arcudi
Human Inspired Technology Research Center, Università di Padova, Padova, PD 35121 IT
Davide Sartor
Davide Sartor
Università degli Studi di Padova
Alberto Sinigaglia
Alberto Sinigaglia
PhD student
Deep Reinforcement LearningDeep Learning
Vincent François-Lavet
Vincent François-Lavet
VU Amsterdam
reinforcement learningdeep learningrepresentation learningmachine learningartificial intelligence
G
Gian Antonio Susto
Dep. of Information Engineering, Università di Padova, Padova, PD 35121 IT