🤖 AI Summary
This study addresses the lack of systematic understanding of when joint training is necessary in multi-agent approaches to job-shop scheduling with transportation resources. Through a sensitivity analysis of resource scarcity and temporal dominance, it quantifies the coordination gap, that is, the performance difference between joint and modular training. In the evaluation, joint training can outperform the best-performing combinations of dispatching rules and modular training; however, its advantage diminishes in bottleneck environments with severe transport and processing constraints. This suggests that modular training remains a viable alternative in settings where a single scheduling task dominates. The results provide an environment-dependent basis for selecting a multi-agent training paradigm in complex scheduling contexts.
📝 Abstract
Efficient job-shop scheduling with transportation resources is critical for high-performance manufacturing. With the rise of "decentralized factories", multi-agent reinforcement learning has emerged as a promising approach for the combined scheduling of production and transportation tasks. Prior work has largely focused on developing novel cooperative architectures while overlooking the question of when joint training is necessary. Joint training denotes the simultaneous training of the job scheduling agent and the automated guided vehicle (AGV) scheduling agent, whereas modular training trains each agent independently and integrates them post hoc. In this study, we systematically investigate the conditions under which joint training is essential for optimal performance in the job-shop scheduling problem with transportation resources. Through a rigorous sensitivity analysis of resource scarcity and temporal dominance, we quantify the coordination gap, defined as the performance difference between these two training modalities. In our evaluation, joint training can outperform the best-performing combinations of dispatching rules and modular training. However, the coordination gap narrows in bottleneck environments, particularly under severe transport and processing constraints. These findings indicate that modular training is a viable alternative in environments where a single scheduling task dominates. Overall, our work provides practical guidance for selecting between training modalities based on environmental conditions, enabling decision-makers to optimize reinforcement learning-based scheduling performance.
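
As an illustration of the coordination gap described above, the following minimal sketch computes a relative gap between the two training modalities. It rests on assumptions not stated in the abstract: that performance is measured by makespan (lower is better), that the gap is normalized by the modular result, and that results are averaged over evaluation instances. The function name `coordination_gap` and the input values are hypothetical and are not results from the paper.

```python
from statistics import mean


def coordination_gap(joint_makespans, modular_makespans):
    """Relative gap between mean modular and mean joint makespan.

    Assumption: makespan is the performance metric and lower is better,
    so a positive gap means joint training performed better.
    """
    joint = mean(joint_makespans)
    modular = mean(modular_makespans)
    return (modular - joint) / modular


# Hypothetical makespans for one environment setting (illustrative only).
gap = coordination_gap([90.0, 110.0], [120.0, 130.0])
print(f"coordination gap: {gap:.2%}")
```

Under these assumptions, a gap near zero in a given environment would correspond to the case where modular training is a viable alternative to joint training.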