🤖 AI Summary
This work addresses the training-test scale mismatch in learning generalizing action policies: policies are trained on small instances but must perform on much larger ones, and fixed validation sets give limited guidance for selecting models that scale. To resolve this, the authors propose generating the validation set dynamically, increasing validation instance size on the fly for as long as doing so remains informative and computationally feasible. They also introduce a refined, statistically grounded evaluation methodology that generates test instances systematically, guaranteeing a given confidence in the measured coverage at each instance size. Across nine benchmark domains, dynamic validation consistently improves the scaling behavior of GNN-based policies, demonstrating its effectiveness in mitigating the train-test scale mismatch.
📝 Abstract
Recent work has shown that successful per-domain generalizing action policies can be learned. Scaling behavior, from small training instances to large test instances, is the key objective; and using validation instances larger than the training instances is one key to achieving it. Prior work has used fixed validation sets. Here, we introduce a method that generates the validation set dynamically, on the fly, increasing instance size for as long as doing so remains informative and feasible. We also introduce a refined methodology for evaluating scaling behavior, generating test instances systematically to guarantee a given confidence in coverage performance at each instance size. In experiments, dynamic validation improves the scaling behavior of GNN policies in all 9 domains used.
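The "increase instance size so long as informative and feasible" loop can be sketched as follows. This is a minimal illustration, not the paper's algorithm: `evaluate`, `generate_instances`, and the stopping rule (coverage dropping below a floor, plus a size budget) are illustrative assumptions.

```python
def generate_instances(size, n):
    """Placeholder: sample n validation instances of the given size."""
    return [(size, i) for i in range(n)]

def evaluate(policy, instances):
    """Placeholder: fraction of instances the policy solves."""
    return sum(policy(inst) for inst in instances) / len(instances)

def dynamic_validation(policy, start_size, max_size, n_per_size=20, min_cov=0.1):
    """Grow the validation instance size on the fly, recording coverage,
    until results stop being informative (coverage near zero) or the
    size budget (a proxy for feasibility) is exhausted."""
    size, history = start_size, []
    while size <= max_size:                  # feasibility cutoff
        cov = evaluate(policy, generate_instances(size, n_per_size))
        if cov < min_cov:                    # no longer informative
            break
        history.append((size, cov))
        size += 1
    return history
```

With a toy policy that solves only instances up to size 5, the loop records coverage for sizes 1 through 5 and then stops, which is the intended behavior: validation effort concentrates on the largest sizes where the policy still succeeds.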