Strategic Communication under Threat: Learning Information Trade-offs in Pursuit-Evasion Games

📅 2025-10-09
📈 Citations: 0
Influential: 0
🤖 AI Summary
In pursuit-evasion adversarial settings, pursuers must balance the situational advantage gained from communicating to acquire evader location information against the survival risk posed by revealing their own position. To address this challenge, we propose the PEEC game model and introduce SHADOW—a novel multi-head sequential reinforcement learning framework enabling joint communication and motion decision-making under partial observability. SHADOW integrates continuous-action control, discrete communication scheduling, and opponent behavior prediction. It employs a joint optimization strategy combining policy gradients and opponent modeling, with ablation studies and sensitivity analyses validating the contribution of each component. Experiments demonstrate that SHADOW significantly improves capture success rates over six baseline methods and exhibits strong generalization across varying communication costs and physical asymmetry scenarios.

📝 Abstract
Adversarial environments require agents to navigate a key strategic trade-off: acquiring information enhances situational awareness, but may simultaneously expose them to threats. To investigate this tension, we formulate a Pursuit-Evasion-Exposure-Concealment Game (PEEC) in which a pursuer agent must decide when to communicate in order to obtain the evader's position. Each communication reveals the pursuer's location, increasing the risk of being targeted. Both agents learn their movement policies via reinforcement learning, while the pursuer additionally learns a communication policy that balances observability and risk. We propose SHADOW (Strategic-communication Hybrid Action Decision-making under partial Observation for Warfare), a multi-headed sequential reinforcement learning framework that integrates continuous navigation control, discrete communication actions, and opponent modeling for behavior prediction. Empirical evaluations show that SHADOW pursuers achieve higher success rates than six competitive baselines. Our ablation study confirms that temporal sequence modeling and opponent modeling are critical for effective decision-making. Finally, our sensitivity analysis reveals that the learned policies generalize well across varying communication risks and physical asymmetries between agents.
Problem

Research questions and friction points this paper is trying to address.

Balancing information acquisition with threat exposure risks
Learning communication policies under partial observability constraints
Integrating opponent modeling with hybrid action decision-making
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-headed reinforcement learning for hybrid actions
Sequential modeling integrates navigation and communication
Opponent behavior prediction enhances strategic decision-making
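The multi-head hybrid-action idea in the bullets above can be sketched in a minimal, hypothetical form: a shared encoder feeding three heads, one emitting a continuous navigation command, one emitting discrete communicate/stay-silent probabilities, and one predicting the evader's position. SHADOW's actual architecture is not specified on this page; all layer sizes, weight names, and the single-layer design here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions (not from the paper): observation, hidden, nav-action sizes.
OBS_DIM, HID_DIM, NAV_DIM = 8, 16, 2

# Shared encoder plus three heads: continuous navigation (Gaussian mean),
# discrete communication (logits over {silent, communicate}),
# and an opponent-position prediction head.
W_enc = rng.normal(0, 0.1, (OBS_DIM, HID_DIM))
W_nav = rng.normal(0, 0.1, (HID_DIM, NAV_DIM))
W_com = rng.normal(0, 0.1, (HID_DIM, 2))
W_opp = rng.normal(0, 0.1, (HID_DIM, 2))

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

def policy(obs):
    """One forward pass of the multi-head policy on a single observation."""
    h = np.tanh(obs @ W_enc)          # shared representation
    nav_mean = np.tanh(h @ W_nav)     # continuous motion command in [-1, 1]^2
    comm_probs = softmax(h @ W_com)   # P(stay silent), P(communicate)
    opp_pred = h @ W_opp              # predicted evader (x, y)
    return nav_mean, comm_probs, opp_pred

nav, comm, opp = policy(rng.normal(size=OBS_DIM))
```

In training, the continuous head would typically be optimized with a policy gradient, the discrete head with a categorical policy gradient weighted by the communication cost, and the opponent head with a supervised prediction loss; the paper's joint optimization combines these, but the exact losses are not given here.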
Valerio La Gatta
Northwestern University
disinformation mining, explainable AI, multimodal machine learning

Dolev Mutzari
Department of Computer Science, Bar-Ilan University

Sarit Kraus
Professor of Computer Science, Bar-Ilan University
Artificial Intelligence, Human-agent interaction, Multi-agent systems

VS Subrahmanian
Department of Computer Science & Buffett Institute for Global Affairs, Northwestern University