DeepJEB++: Foundation Model-Driven Large-Scale 3D Engineering Dataset via 2D Latent Space Augmentation

📅 2026-06-11

🤖 AI Summary

This work addresses three obstacles to data-driven engineering design: the scarcity of large-scale 3D datasets that pair geometry with physical performance, the failure of conventional 3D augmentation methods to preserve subtle geometric variation, and the difficulty of automating simulation-based labeling when boundary conditions depend on the generated geometry. The authors propose DeepJEB++, a three-stage framework. Stage 1 fine-tunes a pretrained 2D latent diffusion model on multi-view renders, synthesizes new views by latent interpolation, and uses a vision-language model (VLM) to keep only manufacturable designs. Stage 2 lifts the validated images to 3D meshes with a domain-adapted 3D generative foundation model. Stage 3 automatically identifies the load and bolt interfaces on each mesh and assigns finite element analysis (FEA) labels for mass, stress, and displacement. Starting from fewer than 400 seed designs of jet engine brackets, the method produces 15,360 simulation-labeled 3D brackets, a 40× expansion, using a single GPU per stage. Augmentation quality is assessed on manufacturability, label fidelity against the SimJEB ground truth, and distributional consistency.

📝 Abstract

Data-driven engineering design is constrained by the lack of large-scale 3D datasets that pair geometry with physics-based performance labels. In particular, existing 3D data augmentation techniques have limitations in preserving subtle and diverse geometric variations, and it remains difficult to automate the subsequent simulation-labeling process, where boundary conditions vary depending on the generated geometry. We present DeepJEB++, a foundation-model-driven data-augmentation framework that expands a small seed set of jet engine brackets into a large, simulation-labeled 3D dataset under constrained resources. Our key idea is to augment in the data-rich 2D latent space, then transfer to 3D. In Stage 1, we fine-tune a pretrained 2D latent diffusion model on multi-view renders and synthesize novel views by latent interpolation, retaining manufacturable designs through a vision-language-model (VLM) quality filter. In Stage 2, the validated images are lifted to 3D meshes by a domain-adapted generative foundation model. In Stage 3, an automated pipeline recognizes the load and bolt interfaces on each mesh and assigns finite-element labels -- mass, stress, and displacement -- without manual intervention. We assess augmentation quality along three intrinsic axes: manufacturability, label fidelity against the SimJEB ground truth, and distributional consistency. Starting from fewer than 400 seed designs, DeepJEB++ yields 15,360 simulation-labeled 3D brackets -- a 40x expansion -- using a single GPU per stage. The dataset will be made publicly available to support reproducible engineering-AI research.
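The latent interpolation step in Stage 1 can be illustrated with a minimal sketch. The abstract does not say which interpolation scheme DeepJEB++ uses; spherical linear interpolation (slerp) is assumed here only because it is a common choice for Gaussian-like diffusion latents. The function names `slerp` and `interpolate_latents`, the flat latent vectors, and the latent size are all hypothetical. In a real pipeline the latents would come from the fine-tuned model's encoder and the interpolated results would be decoded back to images before VLM filtering.

```python
import math
import random


def slerp(z0, z1, t, eps=1e-7):
    """Spherical interpolation between two nonzero latent vectors (flat lists).

    Assumption: slerp is an illustrative choice, not the paper's confirmed method.
    """
    dot = sum(a * b for a, b in zip(z0, z1))
    norm0 = math.sqrt(sum(a * a for a in z0))
    norm1 = math.sqrt(sum(b * b for b in z1))
    # Clamp to guard against floating-point drift outside [-1, 1].
    cos_omega = max(-1.0, min(1.0, dot / (norm0 * norm1)))
    omega = math.acos(cos_omega)
    sin_omega = math.sin(omega)
    if sin_omega < eps:
        # Vectors are (anti)parallel: fall back to plain linear interpolation.
        return [(1.0 - t) * a + t * b for a, b in zip(z0, z1)]
    w0 = math.sin((1.0 - t) * omega) / sin_omega
    w1 = math.sin(t * omega) / sin_omega
    return [w0 * a + w1 * b for a, b in zip(z0, z1)]


def interpolate_latents(z0, z1, num_steps):
    """Return num_steps latents strictly between z0 and z1 (endpoints excluded)."""
    ts = [(i + 1) / (num_steps + 1) for i in range(num_steps)]
    return [slerp(z0, z1, t) for t in ts]


# Stand-ins for the encoded latents of two seed bracket renders (hypothetical size).
rng = random.Random(0)
z_a = [rng.gauss(0.0, 1.0) for _ in range(256)]
z_b = [rng.gauss(0.0, 1.0) for _ in range(256)]

new_latents = interpolate_latents(z_a, z_b, num_steps=3)
```

Excluding the endpoints means every returned latent is a new candidate design rather than a copy of a seed. Each candidate would still need to pass the quality filter before being lifted to 3D.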

Problem

Research questions and friction points this paper is trying to address.

3D engineering dataset

data augmentation

physics-based performance labels

simulation labeling

geometric variation

Innovation

Methods, ideas, or system contributions that make the work stand out.

2D latent space augmentation

foundation model

automated simulation labeling