🤖 AI Summary
Existing latent-space generative models systematically underestimate spectral amplitudes at dissipative scales when synthesizing turbulent flows, which distorts small-scale structures. This work proposes a spectrally regularized latent flow matching framework that applies a region-weighted log-spectral loss during the VAE compression stage. Encoder-decoder swap experiments indicate that encoder-driven reorganization of the latent representation, rather than decoder capacity, is the main mechanism behind the improved small-scale fidelity. Evaluated on 256² DNS data, the method raises retained spectral power in the deep dissipative range from 25% to 94% in reconstruction and from 20% to 79% in unconditional generation. It reaches a DD bias of −0.117 within only 20 function evaluations, whereas the MSE-trained latent space is capped near −0.70 regardless of integrator or step count. Both the MSE-trained and the spectrally regularized pipelines recover the second-order structure function and the correct sign of the third-order structure function S_3.
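The region-weighted log-spectral loss and the retained-power metric mentioned above can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the shell-binning scheme, the region boundaries `zone_edges`, the weights `zone_weights`, and the cutoff `k_min` are all hypothetical choices made for illustration, and the field is assumed to be square and periodic.

```python
import numpy as np


def radial_spectrum(field):
    """Shell-summed power spectrum of a square, periodic 2D field.

    Returns the power in integer wavenumber shells k = 1 .. n/2 - 1.
    """
    n = field.shape[0]
    fhat = np.fft.fft2(field) / field.size
    power = np.abs(fhat) ** 2
    k1d = np.fft.fftfreq(n, d=1.0 / n)  # integer wavenumbers
    kx, ky = np.meshgrid(k1d, k1d, indexing="ij")
    shell = np.rint(np.hypot(kx, ky)).astype(int)
    spectrum = np.bincount(shell.ravel(), weights=power.ravel())
    return spectrum[1 : n // 2]  # drop the mean mode, stay below Nyquist


def region_weighted_log_spectral_loss(recon, target, zone_edges, zone_weights,
                                      eps=1e-12):
    """Weighted mean squared difference of log-spectra.

    zone_edges   : increasing wavenumbers separating the regions (assumed).
    zone_weights : one weight per region, len(zone_edges) + 1 values (assumed).
    """
    e_rec = radial_spectrum(recon)
    e_tgt = radial_spectrum(target)
    k = np.arange(1, e_tgt.size + 1)
    zone = np.digitize(k, zone_edges)          # region index of each shell
    w = np.asarray(zone_weights, dtype=float)[zone]
    log_err = (np.log(e_rec + eps) - np.log(e_tgt + eps)) ** 2
    return float(np.sum(w * log_err) / np.sum(w))


def retained_power(recon, target, k_min):
    """Fraction of target spectral power that recon keeps for k >= k_min."""
    e_rec = radial_spectrum(recon)
    e_tgt = radial_spectrum(target)
    k = np.arange(1, e_tgt.size + 1)
    mask = k >= k_min
    return float(e_rec[mask].sum() / e_tgt[mask].sum())
```

Because the error is measured on the logarithm of the spectrum, a shell whose power is attenuated by a fixed factor contributes the same penalty whether it sits at the energetic or the dissipative end, and the weights can then emphasize the high-wavenumber regions. A retained-power value of 0.25 in this sketch would correspond to the "25%" style of figure quoted above, under the stated assumptions.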
📝 Abstract
Latent diffusion and flow matching have emerged as leading approaches for synthetic turbulence generation, yet they systematically under-represent dissipation-range amplitudes. We introduce a latent flow matching framework with a spectrally regularized compression stage that directly targets this failure mode. On a 256^2 DNS dataset at Re_f ≈ 2250, replacing an MSE-trained VAE with a zone-weighted log-spectral objective raises deep-dissipation retained spectral power from 25% to 94% in reconstruction and from 20% to 79% in unconditional generation. The improved latent representation also yields a substantially better sampling cost-fidelity tradeoff: the MSE-trained latent space imposes a fundamental quality ceiling near DD bias -0.70 that no integrator or step count can overcome, while the spectrally regularized latent space reaches DD bias -0.117 at just 20 function evaluations. Mechanistically, encoder-decoder swap experiments show that the improvement is driven primarily by encoder-induced latent reorganization rather than decoder capacity, while a support-amplitude decomposition reveals that MSE-trained models behave as conservative suppression models, minimizing pointwise error by attenuating intermittent high-wavenumber structure. Both pipelines recover the second-order structure function and the correct sign of S_3, indicating the correct cascade direction without explicit supervision. A small residual gap in the magnitude of S_3 suggests that phase-coherent triadic organization remains a complementary axis to amplitude fidelity for future generative turbulence models.
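For readers unfamiliar with the structure-function diagnostics, the following is a minimal sketch of how S_2 and S_3 could be estimated from a generated sample. It assumes a periodic 2D array `u` holding the velocity component along axis 0 and an integer separation `r` in grid points; the function name and these conventions are illustrative assumptions, not the evaluation code used in the paper.

```python
import numpy as np


def longitudinal_structure_function(u, r, order):
    """Estimate S_p(r) = <(u(x + r, y) - u(x, y))**p> on a periodic grid.

    u     : 2D array, velocity component along axis 0 (assumed periodic).
    r     : separation in grid points along axis 0.
    order : the exponent p (2 for S_2, 3 for S_3).
    """
    # np.roll with a negative shift brings u[i + r] to position i.
    du = np.roll(u, -r, axis=0) - u
    return float(np.mean(du ** order))
```

Evaluating `longitudinal_structure_function(u, r, 3)` over a range of separations and checking its sign is one way to test the cascade-direction claim, while comparing its magnitude against the DNS reference would expose the residual S_3 gap the abstract describes.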