🤖 AI Summary
Multi-agent herding control of non-cohesive targets without prior dynamical models, with limited sensing range, and under decentralized execution constraints.
Method: We propose a hierarchical policy-gradient reinforcement learning framework based on Proximal Policy Optimization (PPO). It unifies target selection and driving actions within a continuous action space and employs a hierarchical action structure with decentralized execution, eliminating the reliance on discrete actions and system priors inherent in traditional DQN-based approaches.
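At the core of the framework is PPO's clipped surrogate objective, which permits stable policy-gradient updates in continuous action spaces. As a point of reference, here is a minimal NumPy sketch of that objective (the function name and per-sample layout are illustrative, not the paper's implementation):

```python
import numpy as np

def ppo_clip_objective(logp_new, logp_old, advantages, eps=0.2):
    """Clipped surrogate objective from PPO (Schulman et al., 2017).

    logp_new / logp_old: log-probabilities of the taken actions under the
    current and behavior policies; advantages: estimated advantages.
    """
    ratio = np.exp(logp_new - logp_old)          # importance ratio r_t
    unclipped = ratio * advantages               # r_t * A_t
    clipped = np.clip(ratio, 1.0 - eps, 1.0 + eps) * advantages
    # Pessimistic bound: take the minimum, then average over the batch.
    return np.mean(np.minimum(unclipped, clipped))
```

Because the objective clips the importance ratio, large policy steps are discouraged, which is one reason PPO yields smoother continuous-action updates than value-based discrete-action methods such as DQN.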
Contribution/Results: To our knowledge, this is the first work to jointly model target selection and herding control end-to-end in a continuous action space. Experiments demonstrate robust convergence as the number of targets increases and under sensing limitations, with significantly improved trajectory smoothness and task success rate. These results support the framework's scalability and robustness for real-world decentralized multi-agent herding tasks.
📝 Abstract
We propose a decentralized reinforcement learning solution for multi-agent shepherding of non-cohesive targets using policy-gradient methods. Our architecture integrates target selection with target driving through Proximal Policy Optimization, overcoming the discrete-action constraints of previous Deep Q-Network approaches and enabling smoother agent trajectories. This model-free framework solves the shepherding problem without prior knowledge of the target dynamics. Experiments demonstrate the method's effectiveness and scalability under increased target numbers and limited sensing capabilities.