DeepJEB++: Foundation Model-Driven Large-Scale 3D Engineering Dataset via 2D Latent Space Augmentation

📅 2026-06-11
🤖 AI Summary
This work addresses three obstacles to data-driven engineering design: the scarcity of large-scale 3D datasets that pair geometry with physical performance, the inability of conventional augmentation methods to preserve subtle and diverse geometric variation, and the difficulty of automating simulation-based labeling when boundary conditions depend on the generated geometry. The authors propose DeepJEB++, a three-stage framework. Stage 1 fine-tunes a pretrained 2D latent diffusion model on multi-view renders, synthesizes novel views by latent interpolation, and uses a vision-language model (VLM) to retain only manufacturable designs. Stage 2 lifts the validated images to 3D meshes with a domain-adapted 3D generative foundation model. Stage 3 automatically identifies the load and bolt interfaces on each mesh and assigns finite element analysis (FEA) labels for mass, stress, and displacement. Starting from fewer than 400 seed jet engine brackets, the method produces 15,360 simulation-labeled 3D brackets, roughly a 40× expansion, using a single GPU per stage. Augmentation quality is assessed on manufacturability, label fidelity against the SimJEB ground truth, and distributional consistency.
📝 Abstract
Data-driven engineering design is constrained by the lack of large-scale 3D datasets that pair geometry with physics-based performance labels. In particular, existing 3D data augmentation techniques have limitations in preserving subtle and diverse geometric variations, and it remains difficult to automate the subsequent simulation-labeling process, where boundary conditions vary depending on the generated geometry. We present DeepJEB++, a foundation-model-driven data-augmentation framework that expands a small seed set of jet engine brackets into a large, simulation-labeled 3D dataset under constrained resources. Our key idea is to augment in the data-rich 2D latent space, then transfer to 3D. In Stage 1, we fine-tune a pretrained 2D latent diffusion model on multi-view renders and synthesize novel views by latent interpolation, retaining manufacturable designs through a vision-language-model (VLM) quality filter. In Stage 2, the validated images are lifted to 3D meshes by a domain-adapted generative foundation model. In Stage 3, an automated pipeline recognizes the load and bolt interfaces on each mesh and assigns finite-element labels -- mass, stress, and displacement -- without manual intervention. We assess augmentation quality along three intrinsic axes: manufacturability, label fidelity against the SimJEB ground truth, and distributional consistency. Starting from fewer than 400 seed designs, DeepJEB++ yields 15,360 simulation-labeled 3D brackets -- a 40x expansion -- using a single GPU per stage. The dataset will be made publicly available to support reproducible engineering-AI research.
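The latent interpolation step in Stage 1 can be illustrated with a minimal sketch. The abstract does not say which interpolation scheme is used; spherical linear interpolation (slerp), a common choice for diffusion latents, is assumed here. The function names `slerp` and `interpolate_latents`, the flat-list latent representation, and the 16-dimensional toy latents are all hypothetical and are not taken from the paper.

```python
import math
import random


def slerp(z0, z1, t, eps=1e-7):
    """Spherical linear interpolation between two non-zero latent vectors.

    Assumption: latents are flat lists of floats. A real latent diffusion
    model would use tensors, but the arithmetic is the same.
    """
    n0 = math.sqrt(sum(a * a for a in z0))
    n1 = math.sqrt(sum(b * b for b in z1))
    # Angle between the two latents, clamped for numerical safety.
    cos_omega = sum(a * b for a, b in zip(z0, z1)) / (n0 * n1)
    cos_omega = max(-1.0, min(1.0, cos_omega))
    omega = math.acos(cos_omega)
    sin_omega = math.sin(omega)
    if sin_omega < eps:
        # Nearly parallel or anti-parallel vectors: fall back to linear interpolation.
        return [(1.0 - t) * a + t * b for a, b in zip(z0, z1)]
    w0 = math.sin((1.0 - t) * omega) / sin_omega
    w1 = math.sin(t * omega) / sin_omega
    return [w0 * a + w1 * b for a, b in zip(z0, z1)]


def interpolate_latents(z0, z1, n_steps):
    """Return n_steps latents strictly between z0 and z1 (endpoints excluded)."""
    return [slerp(z0, z1, k / (n_steps + 1)) for k in range(1, n_steps + 1)]


# Toy stand-ins for the encoded latents of two seed renders (hypothetical).
rng = random.Random(0)
z_a = [rng.gauss(0.0, 1.0) for _ in range(16)]
z_b = [rng.gauss(0.0, 1.0) for _ in range(16)]

new_latents = interpolate_latents(z_a, z_b, n_steps=3)
```

In a pipeline like the one described, each interpolated latent would then be decoded to an image and passed through the VLM quality filter before 3D reconstruction. Those steps depend on the specific models and are not sketched here.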
Problem

Research questions and friction points this paper is trying to address.

3D engineering dataset
data augmentation
physics-based performance labels
simulation labeling
geometric variation
Innovation

Methods, ideas, or system contributions that make the work stand out.

2D latent space augmentation
foundation model
automated simulation labeling
generative 3D reconstruction
engineering dataset scaling