Strategic Communication under Threat: Learning Information Trade-offs in Pursuit-Evasion Games

📅 2025-10-09
📈 Citations: 0
Influential: 0
🤖 AI Summary
In pursuit-evasion adversarial settings, pursuers must balance the situational advantage gained from communicating to acquire evader location information against the survival risk posed by revealing their own position. To address this challenge, we propose the PEEC game model and introduce SHADOW—a novel multi-head sequential reinforcement learning framework enabling joint communication and motion decision-making under partial observability. SHADOW integrates continuous-action control, discrete communication scheduling, and opponent behavior prediction. It employs a joint optimization strategy combining policy gradients and opponent modeling, with ablation studies and sensitivity analyses validating the contribution of each component. Experiments demonstrate that SHADOW significantly improves capture success rates over six baseline methods and exhibits strong generalization across varying communication costs and physical asymmetry scenarios.

📝 Abstract
Adversarial environments require agents to navigate a key strategic trade-off: acquiring information enhances situational awareness, but may simultaneously expose them to threats. To investigate this tension, we formulate a Pursuit-Evasion-Exposure-Concealment Game (PEEC) in which a pursuer agent must decide when to communicate in order to obtain the evader's position. Each communication reveals the pursuer's location, increasing the risk of being targeted. Both agents learn their movement policies via reinforcement learning, while the pursuer additionally learns a communication policy that balances observability and risk. We propose SHADOW (Strategic-communication Hybrid Action Decision-making under partial Observation for Warfare), a multi-headed sequential reinforcement learning framework that integrates continuous navigation control, discrete communication actions, and opponent modeling for behavior prediction. Empirical evaluations show that SHADOW pursuers achieve higher success rates than six competitive baselines. Our ablation study confirms that temporal sequence modeling and opponent modeling are critical for effective decision-making. Finally, our sensitivity analysis reveals that the learned policies generalize well across varying communication risks and physical asymmetries between agents.
Problem

Research questions and friction points this paper is trying to address.

Balancing information acquisition with threat exposure risks
Learning communication policies under partial observability constraints
Integrating opponent modeling with hybrid action decision-making
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-headed reinforcement learning for hybrid actions
Sequential modeling integrates navigation and communication
Opponent behavior prediction enhances strategic decision-making
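The multi-head hybrid-action idea in the bullets above can be sketched in a minimal, hypothetical form: a shared encoder feeding three heads, one emitting a continuous navigation command, one emitting discrete communicate/stay-silent probabilities, and one predicting the evader's position. SHADOW's actual architecture is not specified on this page; all layer sizes, weight names, and the single-layer design here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions (not from the paper): observation, hidden, nav-action sizes.
OBS_DIM, HID_DIM, NAV_DIM = 8, 16, 2

# Shared encoder plus three heads: continuous navigation (Gaussian mean),
# discrete communication (logits over {silent, communicate}),
# and an opponent-position prediction head.
W_enc = rng.normal(0, 0.1, (OBS_DIM, HID_DIM))
W_nav = rng.normal(0, 0.1, (HID_DIM, NAV_DIM))
W_com = rng.normal(0, 0.1, (HID_DIM, 2))
W_opp = rng.normal(0, 0.1, (HID_DIM, 2))

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

def policy(obs):
    """One forward pass of the multi-head policy on a single observation."""
    h = np.tanh(obs @ W_enc)          # shared representation
    nav_mean = np.tanh(h @ W_nav)     # continuous motion command in [-1, 1]^2
    comm_probs = softmax(h @ W_com)   # P(stay silent), P(communicate)
    opp_pred = h @ W_opp              # predicted evader (x, y)
    return nav_mean, comm_probs, opp_pred

nav, comm, opp = policy(rng.normal(size=OBS_DIM))
```

In training, the continuous head would typically be optimized with a policy gradient, the discrete head with a categorical policy gradient weighted by the communication cost, and the opponent head with a supervised prediction loss; the paper's joint optimization combines these, but the exact losses are not given here.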
Valerio La Gatta
Northwestern University
disinformation mining, explainable AI, multimodal machine learning

Dolev Mutzari
Department of Computer Science, Bar-Ilan University

Sarit Kraus
Professor of Computer Science, Bar-Ilan University
Artificial Intelligence, Human-agent interaction, Multi-agent systems

VS Subrahmanian
Department of Computer Science & Buffett Institute for Global Affairs, Northwestern University