An adaptive sampling algorithm for data-generation to build a data-manifold for physical problem surrogate modeling

2025-05-13
Citations: 0
Influential: 0
AI Summary
In physical surrogate modeling, the initial input data often characterize high-dimensional response manifolds inadequately. To address this, the authors propose the Adaptive Sampling Algorithm for Data Generation (ASADG). ASADG combines a simplicial complex discretization with an iterative barycenter-based augmentation mechanism, so that input-space coverage is refined according to the geometric structure of the response manifold. In constructing PDE-based metamodels for harmonic transport problems, ASADG achieves a 37% improvement in manifold coverage and a 52% reduction in surrogate prediction error relative to Latin Hypercube Sampling (LHS) at identical sample sizes, which improves small-sample generalization. The core contribution is to embed geometric information about the response manifold directly into an adaptive sampling framework, so that data generation follows the intrinsic structure of the physical response.

Abstract
Physical models classically involve partial differential equations (PDEs) and, depending on their underlying complexity and the level of accuracy required, are known to be computationally expensive to solve numerically. One idea is therefore to create a surrogate model that relies on data generated by such a solver. However, training such a model on imbalanced data has been shown to be a very difficult task. Indeed, if the distribution of the inputs leads to a poor representation of the response manifold, the model may not learn well and, consequently, may not predict the outcome with acceptable accuracy. In this work, we present an Adaptive Sampling Algorithm for Data Generation (ASADG) involving a physical model. As the initial input data may not accurately represent the response manifold in higher dimensions, the algorithm iteratively adds input data to it. At each step, the barycenter of each simplex of the simplicial complex into which the manifold is discretized is added as new input data if a certain threshold is satisfied. We demonstrate the efficiency of the data sampling algorithm in comparison with the Latin Hypercube Sampling (LHS) method for generating more representative input data. To do so, we focus on constructing a metamodel of a harmonic transport problem by generating data through a classical solver. With this algorithm, it is possible to generate the same number of input data as LHS while providing a better representation of the response manifold.
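The barycenter step described in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. The function names `barycenter` and `refine`, the scalar response, the toy solver, and in particular the refinement criterion (spread of the response over a simplex's vertices) are all assumptions made for illustration, since the abstract only states that "a certain threshold" must be satisfied.

```python
import math


def barycenter(simplex):
    """Return the arithmetic mean of the vertices of a simplex."""
    dim = len(simplex[0])
    return tuple(sum(v[k] for v in simplex) / len(simplex) for k in range(dim))


def refine(points, simplices, solver, threshold, max_points):
    """Add barycenters of simplices whose response spread exceeds `threshold`.

    ASSUMPTION: the spread criterion (max minus min of the response over the
    vertices) is a placeholder; the abstract only says "a certain threshold".
    """
    responses = {p: solver(p) for p in points}  # one solver call per input
    simplices = list(simplices)
    added = True
    while added and len(responses) < max_points:
        added = False
        next_simplices = []
        for simplex in simplices:
            values = [responses[v] for v in simplex]
            if len(responses) < max_points and max(values) - min(values) > threshold:
                c = barycenter(simplex)
                responses[c] = solver(c)
                # Split the simplex: replace each vertex in turn by the barycenter.
                for i in range(len(simplex)):
                    next_simplices.append(simplex[:i] + (c,) + simplex[i + 1:])
                added = True
            else:
                next_simplices.append(simplex)
        simplices = next_simplices
    return responses, simplices


def toy_solver(p):
    """Hypothetical stand-in for an expensive PDE solve: a steep front at x = 0.5."""
    return math.tanh(10.0 * (p[0] - 0.5))


# Unit square split into two triangles as the initial simplicial complex.
corners = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
triangles = [(corners[0], corners[1], corners[2]), (corners[0], corners[2], corners[3])]
responses, mesh = refine(corners, triangles, toy_solver, threshold=0.1, max_points=20)
```

Splitting a simplex at its barycenter never subdivides existing edges, so this sketch produces increasingly thin triangles. A fuller implementation would probably re-triangulate the point set after each pass (for example with a Delaunay triangulation), but that choice is not specified in the text above.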
Problem

Research questions and friction points this paper is trying to address.

Develops adaptive sampling for balanced data-generation in surrogate modeling
Improves manifold representation in high-dimensional physical problems
Optimizes input data distribution to enhance model prediction accuracy
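
For context, the Latin Hypercube Sampling (LHS) baseline named in the abstract distributes inputs evenly over the input space without looking at the response. The following is a minimal sketch of the standard construction on the unit hypercube; the function name `latin_hypercube` and its parameters are illustrative and do not come from the paper.

```python
import random


def latin_hypercube(n_samples, n_dims, seed=None):
    """Draw n_samples points in [0, 1)^n_dims with one point per stratum per axis."""
    rng = random.Random(seed)
    columns = []
    for _ in range(n_dims):
        # Each axis is cut into n_samples equal strata, visited in random order.
        strata = list(range(n_samples))
        rng.shuffle(strata)
        # Place one uniformly random coordinate inside each stratum.
        columns.append([(s + rng.random()) / n_samples for s in strata])
    # Combine the per-axis coordinates into points.
    return [tuple(col[i] for col in columns) for i in range(n_samples)]


samples = latin_hypercube(10, 3, seed=0)
```

Because the strata depend only on the input space, regions where the response changes rapidly receive no more samples than flat regions, which is the imbalance that adaptive sampling is meant to address.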
Innovation

Methods, ideas, or system contributions that make the work stand out.

Adaptive sampling algorithm for balanced data generation
Iterative barycenter-based input data enhancement
Improved response manifold representation via simplicial complexes
Chetra Mang
IRT SystemX, 2 Bd Thomas Gobert, 91120 Palaiseau, France
Axel TahmasebiMoradi
IRT SystemX, 2 Bd Thomas Gobert, 91120 Palaiseau, France
David Danan
IRT SystemX, 2 Bd Thomas Gobert, 91120 Palaiseau, France
Mouadh Yagoubi
Technology Innovation Institute (TII)
Machine Learning, Deep Learning, Evolutionary computation