🤖 AI Summary
Initial task allocation (ITA) in multi-human, multi-robot collaboration faces three key challenges: difficulty in forming complementary pairings among heterogeneous agents, weak alignment with multiple human objectives, and poor adaptability to dynamic environments.
Method: This paper proposes a rule-guided and experience-enhanced large language model (LLM) reasoning framework. It integrates a symbolic rule engine, contextualized LLM inference, experience cache retrieval, and policy distillation to achieve lightweight, interpretable, human-intervenable, and evolvable ITA decisions.
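The described pipeline (symbolic rule filtering, experience retrieval, and in-context LLM inference) can be sketched roughly as follows. This is a minimal illustrative sketch, not the paper's REBEL implementation: all names (`rule_filter`, `retrieve_experience`, the `Task`/`Agent` types) and the fallback behavior are assumptions.

```python
# Hypothetical sketch of a rule-guided, experience-enhanced ITA step.
# The actual REBEL framework is not reproduced here; structure is assumed.
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    required_skills: frozenset


@dataclass
class Agent:
    name: str
    skills: frozenset


def rule_filter(task, agents):
    """Symbolic rule engine: keep only agents whose skills cover the task."""
    return [a for a in agents if task.required_skills <= a.skills]


def retrieve_experience(task, cache):
    """Experience cache retrieval: past allocations for the same task type."""
    return [e for e in cache if e.get("task") == task.name]


def build_prompt(task, candidates, experiences, preferences):
    """Assemble an in-context prompt for the LLM to rank rule-valid candidates."""
    lines = [f"Task: {task.name} (skills: {sorted(task.required_skills)})"]
    lines += [f"Candidate: {a.name}" for a in candidates]
    lines += [f"Past allocation: {e}" for e in experiences]
    lines += [f"User preference: {p}" for p in preferences]
    return "\n".join(lines)


def allocate(task, agents, cache, preferences, llm=None):
    """One ITA decision: rules prune, experience and preferences contextualize,
    the LLM (if provided) chooses among the remaining candidates."""
    candidates = rule_filter(task, agents)
    if not candidates:
        return None
    prompt = build_prompt(task, candidates,
                          retrieve_experience(task, cache), preferences)
    if llm is not None:
        return llm(prompt, candidates)  # LLM picks a rule-valid candidate
    return candidates[0]  # fallback when no LLM is attached


agents = [Agent("robot_1", frozenset({"lift"})),
          Agent("human_1", frozenset({"lift", "inspect"}))]
task = Task("inspect_shelf", frozenset({"inspect"}))
chosen = allocate(task, agents, cache=[],
                  preferences=["minimize human workload"])
print(chosen.name)  # → human_1 (only rule-valid candidate)
```

Constraining the LLM to rule-valid candidates is what keeps decisions interpretable and human-intervenable: operators can inspect or override the symbolic rules without retraining anything.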
Contribution/Results: Compared to existing learning-based approaches, the proposed method significantly reduces computational overhead while enabling explicit injection of multi-objective user preferences and real-time response to team reconfiguration. Experiments demonstrate improved preference alignment and faster reaction to emergent events in both single- and multi-objective ITA tasks. Moreover, it effectively enhances pre-trained policies, yielding higher task completion rates and greater collaborative robustness.
📝 Abstract
Multi-human multi-robot teams are increasingly recognized for their efficiency in executing large-scale, complex tasks by integrating heterogeneous yet potentially synergistic humans and robots. However, this inherent heterogeneity presents significant challenges in teaming, necessitating efficient initial task allocation (ITA) strategies that optimally form complementary human-robot pairs or collaborative chains and establish well-matched task distributions. While current learning-based methods demonstrate promising performance, they often incur high computational costs and lack the flexibility to incorporate user preferences in multi-objective optimization (MOO) or adapt to last-minute changes in dynamic real-world environments. To address these limitations, we propose REBEL, an LLM-based ITA framework that integrates rule-based and experience-enhanced learning to enhance LLM reasoning capabilities and improve in-context adaptability to MOO and situational changes. Extensive experiments validate the effectiveness of REBEL in both single-objective and multi-objective scenarios, demonstrating superior alignment with user preferences and enhanced situational awareness to handle unexpected team composition changes. Additionally, we show that REBEL can complement pre-trained ITA policies, further boosting situational adaptability and overall team performance. Website at https://sites.google.com/view/ita-rebel.