🤖 AI Summary
This work addresses a limitation of conventional physics-informed machine learning evaluation: it relies heavily on curve-fitting error, which does not capture practical engineering decisions such as ranking candidates, avoiding infeasible designs, and limiting regret. To bridge this gap, the authors propose a decision-oriented evaluation paradigm and introduce pinn-gym, an open-source benchmark for material-conditioned lattice design that couples a transparent reduced-order crush-and-impact oracle with five printable polymer cards and dimensionless force-response targets. The protocol assesses surrogates on curve fidelity, physical admissibility, top-k retrieval, and mass regret. Across per-material, pooled, and cross-material settings, the results indicate that low normalized root-mean-square error (nRMSE) is frequently insufficient to identify useful design selections, and that physics-informed losses shift performance trade-offs rather than improving all metrics at once. Within the released oracle, candidate generator, and material cards, pinn-gym provides a reproducible testbed for evaluating physics-informed surrogate models as decision systems.
📝 Abstract
Physics-informed machine learning (PIML) is often assessed by curve error, although engineering use depends on downstream decisions: ranking candidates, avoiding infeasible designs and limiting regret. We introduce pinn-gym, an open benchmark for material-conditioned lattice design that couples a transparent reduced-order crush-and-impact oracle with five printable polymer cards, dimensionless force-response targets and a protocol spanning curve fidelity, physical admissibility, top-k retrieval and mass regret.
Across per-material, pooled and cross-material settings, low normalized root-mean-square error (nRMSE) is frequently insufficient to identify useful design selections. Physics-informed losses alter trade-offs rather than monotonically improving all metrics, and dimensionless conditioning improves comparability without making transfer symmetric. The benchmark is not a certified material model; within the released oracle, candidate generator and material cards, pinn-gym provides a reproducible testbed for evaluating PIML surrogates as decision systems rather than curve predictors alone.
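To make the distinction between curve metrics and decision metrics concrete, the following is a minimal Python sketch of three of the quantities named in the abstract. It is not the pinn-gym API. The function names, the range-based nRMSE normalization, the "lower score is better" ranking convention, and the definition of mass regret as the relative extra mass of the surrogate's lightest predicted-feasible pick over the lightest oracle-feasible candidate are all assumptions chosen for illustration; the benchmark's own definitions may differ.

```python
import numpy as np


def nrmse(pred, true):
    """RMSE divided by the range of the reference curve (one common convention; assumed here)."""
    pred, true = np.asarray(pred, float), np.asarray(true, float)
    rmse = np.sqrt(np.mean((pred - true) ** 2))
    return float(rmse / (true.max() - true.min()))


def top_k_hit(pred_scores, true_scores, k):
    """True if the oracle-best candidate (lowest true score) is among the surrogate's k best."""
    best = int(np.argmin(true_scores))
    return best in np.argsort(pred_scores)[:k].tolist()


def mass_regret(mass, pred_feasible, true_feasible):
    """Relative extra mass of the surrogate's pick, plus whether that pick is truly feasible.

    The pick is the lightest candidate the surrogate predicts to be feasible;
    the reference is the lightest candidate the oracle marks as feasible.
    """
    mass = np.asarray(mass, float)
    true_feasible = np.asarray(true_feasible, bool)
    pred_idx = np.flatnonzero(pred_feasible)
    true_idx = np.flatnonzero(true_feasible)
    if pred_idx.size == 0 or true_idx.size == 0:
        raise ValueError("need at least one predicted-feasible and one oracle-feasible candidate")
    pick = pred_idx[np.argmin(mass[pred_idx])]
    best = true_idx[np.argmin(mass[true_idx])]
    regret = (mass[pick] - mass[best]) / mass[best]
    return float(regret), bool(true_feasible[pick])


# Hypothetical toy data: four candidate designs (values are illustrative only).
mass = [1.0, 1.2, 1.5, 2.0]
true_feasible = [False, True, True, True]
pred_feasible = [False, False, True, True]  # surrogate wrongly rejects candidate 1

curve_error = nrmse([0.1, 1.1, 1.9, 1.0], [0.0, 1.0, 2.0, 1.0])
regret, pick_is_feasible = mass_regret(mass, pred_feasible, true_feasible)
```

In this toy case the predicted curve tracks the reference closely, yet the surrogate's selection is heavier than the lightest feasible design, which is the kind of gap between curve fidelity and decision quality that the abstract describes.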