🤖 AI Summary
This work addresses the training-test scale mismatch in learning generalizing action policies: policies are trained on small instances but must perform on much larger ones, and fixed validation sets give limited guidance for selecting models that scale. To resolve this, the authors propose generating the validation set dynamically, increasing validation instance size on the fly for as long as doing so remains informative and computationally feasible. They also introduce a refined, statistically grounded evaluation methodology that generates test instances systematically, guaranteeing a given confidence in the measured coverage at each instance size. Across nine benchmark domains, dynamic validation consistently improves the scaling behavior of GNN-based policies, demonstrating its effectiveness in mitigating the train-test scale mismatch.
📝 Abstract
Recent work has shown that successful per-domain generalizing action policies can be learned. Scaling behavior, from small training instances to large test instances, is the key objective; and using validation instances larger than the training instances is one key to achieving it. Prior work has used fixed validation sets. Here, we introduce a method that generates the validation set dynamically, on the fly, increasing instance size for as long as doing so remains informative and feasible. We also introduce a refined methodology for evaluating scaling behavior, generating test instances systematically to guarantee a given confidence in coverage performance at each instance size. In experiments, dynamic validation improves the scaling behavior of GNN policies in all 9 domains used.
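The "increase instance size so long as informative and feasible" loop can be sketched as follows. This is a minimal illustration, not the paper's algorithm: `evaluate`, `generate_instances`, and the stopping rule (coverage dropping below a floor, plus a size budget) are illustrative assumptions.

```python
def generate_instances(size, n):
    """Placeholder: sample n validation instances of the given size."""
    return [(size, i) for i in range(n)]

def evaluate(policy, instances):
    """Placeholder: fraction of instances the policy solves."""
    return sum(policy(inst) for inst in instances) / len(instances)

def dynamic_validation(policy, start_size, max_size, n_per_size=20, min_cov=0.1):
    """Grow the validation instance size on the fly, recording coverage,
    until results stop being informative (coverage near zero) or the
    size budget (a proxy for feasibility) is exhausted."""
    size, history = start_size, []
    while size <= max_size:                  # feasibility cutoff
        cov = evaluate(policy, generate_instances(size, n_per_size))
        if cov < min_cov:                    # no longer informative
            break
        history.append((size, cov))
        size += 1
    return history
```

With a toy policy that solves only instances up to size 5, the loop records coverage for sizes 1 through 5 and then stops, which is the intended behavior: validation effort concentrates on the largest sizes where the policy still succeeds.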