🤖 AI Summary
This study addresses the challenge of efficiently predicting the discharge capacity of piano key weirs (PKWs), which depends on complex interactions between geometric configuration and hydraulic conditions. Conventional CFD simulations are computationally expensive, and structured datasets for developing high-fidelity surrogate models are scarce. To bridge this gap, the authors introduce WeirNet, a benchmark dataset of 71,387 completed OpenFOAM simulations covering 3,794 parametric PKW geometries, each scheduled at 19 discharge conditions. It is a large-scale, multimodal resource (parametric descriptors, watertight surface meshes, point clouds) with a standardized evaluation protocol. Experiments show that tree-based models on parametric descriptors achieve the highest overall accuracy, while point- and mesh-based models remain competitive and do not depend on a particular parameterization. All surrogate models run inference in milliseconds per sample, offering speedups of several orders of magnitude over CFD. Out-of-distribution performance degrades mainly under geometry shift rather than under unseen discharge values. This work advances reproducible, data-driven research in hydraulic structure design.
📝 Abstract
Reliable prediction of hydraulic performance is challenging for Piano Key Weir (PKW) design because discharge capacity depends on three-dimensional geometry and operating conditions. Surrogate models can accelerate hydraulic-structure design, but progress is limited by the scarcity of large, well-documented datasets that jointly capture geometric variation, operating conditions, and functional performance. This study presents WeirNet, a large 3D CFD benchmark dataset for geometric surrogate modeling of PKWs. WeirNet contains 3,794 parametric, feasibility-constrained rectangular and trapezoidal PKW geometries, each scheduled at 19 discharge conditions using a consistent free-surface OpenFOAM workflow. The 71,387 completed simulations form the benchmark, each with a complete discharge coefficient label. The dataset is released in multiple modalities (compact parametric descriptors, watertight surface meshes, and high-resolution point clouds) together with standardized tasks and in-distribution and out-of-distribution splits. Representative surrogate families are benchmarked for discharge coefficient prediction. Tree-based regressors on parametric descriptors achieve the best overall accuracy, while point- and mesh-based models remain competitive and offer parameterization-agnostic inference. All surrogates evaluate in milliseconds per sample, providing orders-of-magnitude speedups over CFD runtimes. Out-of-distribution results identify geometry shift, rather than unseen discharge values, as the dominant failure mode, and data-efficiency experiments show diminishing returns beyond roughly 60% of the training data. By publicly releasing the dataset together with simulation setups and evaluation pipelines, WeirNet establishes a reproducible framework for data-driven hydraulic modeling and enables faster exploration of PKW designs during the early stages of hydraulic planning.
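The dataset counts and the two kinds of out-of-distribution shift mentioned in the abstract can be illustrated with a small sketch. The three counts are taken from the abstract; everything else is an assumption made for illustration: the function names `split_by_geometry` and `split_by_discharge`, the 20% holdout fraction, the held-out discharge levels, and the toy sample grid are hypothetical and do not reproduce the actual WeirNet splits.

```python
import random
from itertools import product

# Counts stated in the abstract.
N_GEOMETRIES = 3794
N_DISCHARGES = 19
N_COMPLETED = 71387

# Scheduled runs versus completed runs.
scheduled = N_GEOMETRIES * N_DISCHARGES      # 72,086 scheduled simulations
missing = scheduled - N_COMPLETED            # 699 runs not in the benchmark
completion_rate = N_COMPLETED / scheduled    # about 0.99


def split_by_geometry(samples, holdout_fraction=0.2, seed=0):
    """Geometry shift: hold out whole geometries, so no test geometry
    appears in training. (Hypothetical scheme, not the WeirNet protocol.)"""
    geometry_ids = sorted({g for g, _ in samples})
    rng = random.Random(seed)
    rng.shuffle(geometry_ids)
    n_test = int(len(geometry_ids) * holdout_fraction)
    test_ids = set(geometry_ids[:n_test])
    train = [s for s in samples if s[0] not in test_ids]
    test = [s for s in samples if s[0] in test_ids]
    return train, test


def split_by_discharge(samples, heldout_levels):
    """Discharge shift: hold out selected discharge levels for every geometry.
    (Hypothetical scheme, not the WeirNet protocol.)"""
    heldout = set(heldout_levels)
    train = [s for s in samples if s[1] not in heldout]
    test = [s for s in samples if s[1] in heldout]
    return train, test


# Toy grid of (geometry_id, discharge_level) pairs: 10 geometries x 19 levels.
samples = list(product(range(10), range(N_DISCHARGES)))
geo_train, geo_test = split_by_geometry(samples)
q_train, q_test = split_by_discharge(samples, heldout_levels=[17, 18])

print(scheduled, missing, round(completion_rate, 4))
print(len(geo_train), len(geo_test), len(q_train), len(q_test))
```

In the geometry split, every sample of a held-out weir is unseen during training, whereas in the discharge split each test geometry has already been seen at other discharge levels. That difference is one plausible reading of why geometry shift is reported as the harder case.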