🤖 AI Summary
This work addresses the sharp performance degradation of conventional multi-robot task allocation methods under limited bandwidth, degraded infrastructure, or adversarial interference. The paper proposes FORMICA, a task allocation framework that requires no robot-to-robot communication: robots coordinate implicitly by predicting teammates' competing bids, and the predictors are trained end-to-end to minimize Task Allocation Regret rather than prediction error. The approach combines decision-focused learning with a mean-field approximation, predicting bid distributions so that it can handle task clustering and spatial heterogeneity. In a scenario with 16 robots and 64 tasks, it improves system reward by 17% over an analytical baseline and approaches the optimal MILP solution; the same model still yields a 7% gain when deployed on 256 robots and 4,096 tasks. Training takes only 21 seconds on a laptop.
📝 Abstract
Most multi-robot task allocation methods rely on communication to resolve conflicts and reach consistent assignments. In environments with limited bandwidth, degraded infrastructure, or adversarial interference, existing approaches degrade sharply. We introduce a learning-based framework that achieves high-quality task allocation without any robot-to-robot communication. The key idea is that robots coordinate implicitly by predicting teammates' bids: if each robot can anticipate competition for a task, it can adjust its choices accordingly. Our method predicts bid distributions to correct systematic errors in analytical mean-field approximations. While analytical predictions assume idealized conditions (uniform distributions, known bid functions), our learned approach adapts to task clustering and spatial heterogeneity. Inspired by Smart Predict-then-Optimize (SPO), we train predictors end-to-end to minimize Task Allocation Regret rather than prediction error. To scale to large swarms, we develop a mean-field approximation where each robot predicts the distribution of competing bids rather than individual bids, reducing complexity from $O(NT)$ to $O(T)$. We call our approach FORMICA: Field-Oriented Regret-Minimizing Implicit Coordination Algorithm. Experiments show FORMICA substantially outperforms a natural analytical baseline. In scenarios with 16 robots and 64 tasks, our approach improves system reward by 17% and approaches the optimal MILP solution. When deployed on larger scenarios (256 robots, 4096 tasks), the same model improves performance by 7%, demonstrating strong generalization. Training requires only 21 seconds on a laptop, enabling rapid adaptation to new environments.
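The coordination idea in the abstract can be illustrated with a small Python sketch. It is not the FORMICA implementation, and it rests on several assumptions that do not come from the paper: a task goes to the highest bidder, competitors' bids on a task are treated as independent draws from one per-task distribution (the mean-field view), that distribution is represented by a hypothetical array of samples standing in for a learned predictor, and the optimal reward is found by brute force instead of a MILP solver. All function names are illustrative.

```python
import itertools

import numpy as np


def win_probability(own_bids, competitor_samples, n_competitors):
    """Estimate, per task, the chance that our bid beats every competitor.

    own_bids: shape (T,), this robot's bid for each task.
    competitor_samples: shape (T, S), samples from the predicted
        distribution of a single competitor's bid on each task
        (assumption: a stand-in for a learned predictor).
    n_competitors: number of other robots, assumed i.i.d. per task.
    """
    # Empirical CDF of one competitor's bid, evaluated at our own bid.
    cdf = (competitor_samples < own_bids[:, None]).mean(axis=1)
    # Independence assumption: we must beat all competitors at once.
    return cdf ** n_competitors


def choose_task(own_bids, competitor_samples, n_competitors):
    """Pick the task with the highest expected reward, using no messages."""
    expected = own_bids * win_probability(
        own_bids, competitor_samples, n_competitors
    )
    return int(np.argmax(expected))


def system_reward(values, choices):
    """Total reward when robot r independently picks task choices[r].

    values: shape (N, T) array of robot-task rewards.
    If several robots pick the same task, only the best one is counted
    (assumption: duplicated effort earns nothing extra).
    """
    total = 0.0
    for task in set(choices):
        total += max(values[r, task] for r, c in enumerate(choices) if c == task)
    return total


def optimal_reward(values):
    """Best conflict-free assignment by brute force (small instances only)."""
    n_robots, n_tasks = values.shape
    return max(
        sum(values[r, task] for r, task in enumerate(perm))
        for perm in itertools.permutations(range(n_tasks), n_robots)
    )


def allocation_regret(values, choices):
    """Gap between the optimal reward and the reward actually achieved."""
    return optimal_reward(values) - system_reward(values, choices)
```

Each robot would call `choose_task` on its own bids and its own predicted distributions, so it needs one distribution per task instead of one predicted bid per teammate per task, which is the source of the $O(NT)$ to $O(T)$ reduction described above. A decision-focused training loop would then adjust the predictor to lower `allocation_regret` directly; that loop is omitted here because the abstract does not specify it.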