🤖 AI Summary
Existing neural solvers for multi-objective combinatorial optimization problems often lack robustness on out-of-distribution and complex instances. This work proposes the first preference-guided adversarial attack and defense framework tailored to such problems. The approach generates challenging instances through preference-conditioned adversarial attacks and improves model generalization through adversarial training combined with a hardness-aware preference selection mechanism. Evaluated on three standard benchmarks, the Multi-Objective Traveling Salesman Problem (MOTSP), the Multi-Objective Capacitated Vehicle Routing Problem (MOCVRP), and the Multi-Objective Knapsack Problem (MOKP), the method demonstrates significant improvements in both performance and stability on difficult and out-of-distribution instances.
📝 Abstract
Deep reinforcement learning (DRL) has shown great promise in addressing multi-objective combinatorial optimization problems (MOCOPs). Nevertheless, the robustness of these learning-based solvers remains insufficiently explored, especially across diverse and complex problem distributions. In this paper, we propose a unified robustness-oriented framework for preference-conditioned DRL solvers for MOCOPs. Within this framework, we develop a preference-based adversarial attack that generates hard instances to expose solver weaknesses, and we quantify the attack's impact by the resulting degradation in Pareto-front quality. We further introduce a defense strategy that integrates hardness-aware preference selection into adversarial training to reduce overfitting to restricted preference regions and improve out-of-distribution performance. Experimental results on the multi-objective traveling salesman problem (MOTSP), the multi-objective capacitated vehicle routing problem (MOCVRP), and the multi-objective knapsack problem (MOKP) verify that our attack method successfully generates hard instances for different solvers. Furthermore, our defense method significantly strengthens the robustness and generalizability of neural solvers, delivering superior performance on hard and out-of-distribution instances.
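As a rough illustration of measuring degradation in Pareto-front quality, the sketch below assumes the hypervolume indicator, a common choice for this purpose; the abstract does not name the metric actually used. It also assumes bi-objective minimization, and the function names (`pareto_front`, `hypervolume_2d`, `hv_degradation`), the reference point, and the sample objective values are all hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def pareto_front(points: List[Point]) -> List[Point]:
    """Non-dominated points for bi-objective minimization, sorted by the first objective."""
    front: List[Point] = []
    best_f2 = float("inf")
    # After sorting by (f1, f2), a point is non-dominated exactly when it
    # strictly improves on the best second objective seen so far.
    for f1, f2 in sorted(set(points)):
        if f2 < best_f2:
            front.append((f1, f2))
            best_f2 = f2
    return front


def hypervolume_2d(points: List[Point], ref: Point) -> float:
    """Area dominated by the front and bounded by the reference point `ref`."""
    front = [p for p in pareto_front(points) if p[0] < ref[0] and p[1] < ref[1]]
    hv = 0.0
    prev_f2 = ref[1]
    # Sweep left to right, adding one horizontal slab per front point.
    for f1, f2 in front:
        hv += (ref[0] - f1) * (prev_f2 - f2)
        prev_f2 = f2
    return hv


def hv_degradation(hv_clean: float, hv_attacked: float) -> float:
    """Relative hypervolume loss caused by the attack (assumed impact measure)."""
    return (hv_clean - hv_attacked) / hv_clean


# Hypothetical objective vectors a solver might return on a clean instance
# and on its adversarially perturbed counterpart.
clean_solutions = [(1.0, 3.0), (2.0, 2.0), (3.0, 1.0), (3.0, 3.0)]
attacked_solutions = [(2.0, 3.0), (3.0, 2.5)]
reference_point = (4.0, 4.0)

hv_clean = hypervolume_2d(clean_solutions, reference_point)
hv_attacked = hypervolume_2d(attacked_solutions, reference_point)
print(f"clean HV = {hv_clean}, attacked HV = {hv_attacked}")
print(f"relative degradation = {hv_degradation(hv_clean, hv_attacked):.3f}")
```

A larger relative degradation would indicate a more effective attack under this assumed measure. In practice the objective values are typically normalized and the reference point fixed per problem, so that scores are comparable across instances.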