Parametrized Multi-Agent Routing via Deep Attention Models

📅 2025-07-29

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This paper addresses multi-agent facility location and path optimization (FLPO), an NP-hard mixed discrete-continuous optimization problem. It proposes the Shortest Path Network (SPN), a deep neural policy model grounded in the maximum entropy principle (MEP) and permutation invariance. SPN employs an attention-driven encoder-decoder architecture that jointly models continuous facility locations and discrete path assignments within a unified, end-to-end differentiable framework. According to the abstract, the SPN speeds up policy inference and gradient computation by up to 100× over MEP baselines with an average optimality gap of approximately 6%. The resulting FLPO approach yields over 10× lower transportation cost than metaheuristic baselines and, with annealing, matches Gurobi's optimal cost at a 1500× speedup. Its permutation-invariant design supports scaling with respect to agent count, enabling joint optimization for large-scale multi-agent systems.

📝 Abstract

We propose a scalable deep learning framework for parametrized sequential decision-making (ParaSDM), where multiple agents jointly optimize discrete action policies and shared continuous parameters. A key subclass of this setting arises in Facility-Location and Path Optimization (FLPO), where multi-agent systems must simultaneously determine optimal routes and facility locations, aiming to minimize the cumulative transportation cost within the network. FLPO problems are NP-hard due to their mixed discrete-continuous structure and highly non-convex objective. To address this, we integrate the Maximum Entropy Principle (MEP) with a neural policy model called the Shortest Path Network (SPN), a permutation-invariant encoder-decoder that approximates the MEP solution while enabling efficient gradient-based optimization over shared parameters. The SPN achieves up to 100$\times$ speedup in policy inference and gradient computation compared to MEP baselines, with an average optimality gap of approximately 6% across a wide range of problem sizes. Our FLPO approach yields over 10$\times$ lower cost than metaheuristic baselines while running significantly faster, and matches Gurobi's optimal cost with annealing at a 1500$\times$ speedup, establishing a new state of the art for ParaSDM problems. These results highlight the power of structured deep models for solving large-scale mixed-integer optimization tasks.
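To make the role of the Maximum Entropy Principle and annealing more concrete, the following is a minimal sketch of a generic MEP-style relaxation on a toy instance, not the paper's implementation. It assumes a single agent, two fixed facility locations, a hand-enumerated set of candidate routes, and squared Euclidean leg costs; the names `route_cost`, `gibbs_policy`, and `free_energy` are hypothetical. The sketch shows how a Gibbs distribution over routes sharpens toward the cheapest route as the inverse temperature `beta` grows, and how the associated free energy acts as a smooth stand-in for the minimum cost.

```python
import math

def route_cost(start, facilities, route, dest):
    """Sum of squared Euclidean leg lengths along start -> facilities in route -> dest.
    (Squared distance is an assumption made for this toy example.)"""
    points = [start] + [facilities[j] for j in route] + [dest]
    return sum((ax - bx) ** 2 + (ay - by) ** 2
               for (ax, ay), (bx, by) in zip(points, points[1:]))

def gibbs_policy(costs, beta):
    """Maximum-entropy distribution with p_i proportional to exp(-beta * cost_i).
    Costs are shifted by their minimum for numerical stability."""
    c_min = min(costs)
    weights = [math.exp(-beta * (c - c_min)) for c in costs]
    total = sum(weights)
    return [w / total for w in weights]

def free_energy(costs, beta):
    """Soft minimum -(1/beta) * log(sum_i exp(-beta * cost_i)).
    It approaches min(costs) from below as beta increases."""
    c_min = min(costs)
    return c_min - math.log(sum(math.exp(-beta * (c - c_min)) for c in costs)) / beta

# Toy instance (all values hypothetical).
facilities = [(0.0, 1.0), (1.0, 0.0)]
start, dest = (0.0, 0.0), (2.0, 0.0)
routes = [(0,), (1,), (0, 1), (1, 0)]  # facility indices visited in order
costs = [route_cost(start, facilities, r, dest) for r in routes]

# Annealing schedule: low beta gives a nearly uniform policy, high beta a nearly greedy one.
for beta in (0.1, 1.0, 10.0, 100.0):
    probs = gibbs_policy(costs, beta)
    print(beta, [round(p, 3) for p in probs], round(free_energy(costs, beta), 4))
```

Because the free energy is smooth in the facility coordinates, it can be differentiated with respect to the shared continuous parameters, which is the general reason a relaxation of this kind pairs well with gradient-based optimization. Enumerating routes explicitly, as done here, only works for tiny instances.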

Problem

Research questions and friction points this paper is trying to address.

Optimize multi-agent routes and facility locations jointly

Address NP-hard mixed discrete-continuous optimization challenges

Achieve scalable solutions for large-scale routing problems

Innovation

Methods, ideas, or system contributions that make the work stand out.

Scalable deep learning framework for parametrized sequential decision-making (ParaSDM)

Neural policy model (SPN) that approximates the Maximum Entropy Principle solution

Permutation-invariant encoder-decoder for efficient optimization
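As a small illustration of what permutation invariance means for an encoder, the sketch below applies one shared feature map to every agent and then mean-pools the results, so reordering the agents leaves the encoding unchanged. This is a deliberately simplified stand-in: it uses mean pooling rather than the attention-based architecture described in the summary, and the `embed` weights and the `encode` helper are hypothetical.

```python
import math

def embed(agent):
    """Shared per-agent feature map (hypothetical fixed weights).
    The same function is applied to every agent."""
    x, y = agent
    return (math.tanh(x + 0.5 * y), math.tanh(0.3 * x - y))

def encode(agents):
    """Mean-pool the per-agent embeddings into one fixed-size vector.
    Pooling by a symmetric function makes the result independent of agent order."""
    embedded = [embed(a) for a in agents]
    n = len(embedded)
    return tuple(sum(e[k] for e in embedded) / n for k in range(2))

agents = [(0.0, 1.0), (2.0, -1.0), (0.5, 0.5)]
shuffled = [agents[2], agents[0], agents[1]]  # same agents, different order

print(encode(agents))
print(encode(shuffled))  # equal to the line above up to floating-point rounding
```

The same construction accepts any number of agents without changing the encoder, which is one general reason order-independent designs tend to scale with agent count.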
