🤖 AI Summary
Multi-agent human-robot collaborative navigation faces challenges including high behavioral uncertainty, severe sensor observation noise, and insufficient utilization of both offline and online data. Method: This paper proposes a differentiable hybrid strategic game framework that— for the first time—integrates Conditional Variational Autoencoders (CVAEs) into inverse game-theoretic modeling, jointly capturing behavioral stochasticity, sensor noise, and online adaptability. The approach unifies generative trajectory modeling, differentiable game-theoretic optimization, and inverse reinforcement learning to enable high-dimensional, multimodal behavioral distribution estimation and real-time policy adaptation. Results: On simulated navigation benchmarks, the method recovers near-Nash-equilibrium actions even under model mismatch, ambiguous goals, and strong observation noise—achieving performance comparable to ground-truth models and oracle baselines. It significantly enhances the robustness and practical applicability of inverse game-theoretic reasoning in realistic human-robot interaction scenarios.
📝 Abstract
Game-theoretic models are effective tools for modeling multi-agent interactions, especially when robots need to coordinate with humans. However, applying these models requires inferring their specifications from observed behaviors -- a challenging task known as the inverse game problem. Existing inverse game approaches often struggle to account for behavioral uncertainty and measurement noise, and leverage both offline and online data. To address these limitations, we propose an inverse game method that integrates a generative trajectory model into a differentiable mixed-strategy game framework. By representing the mixed strategy with a conditional variational autoencoder (CVAE), our method can infer high-dimensional, multi-modal behavior distributions from noisy measurements while adapting in real-time to new observations. We extensively evaluate our method in a simulated navigation benchmark, where the observations are generated by an unknown game model. Despite the model mismatch, our method can infer Nash-optimal actions comparable to those of the ground-truth model and the oracle inverse game baseline, even in the presence of uncertain agent objectives and noisy measurements.