🤖 AI Summary
This work systematically investigates how loss weighting strategies and output parameterizations affect model performance in flow matching. Through numerical experiments on synthetic data with controlled geometric structure and on real-world images, the study disentangles their interaction effects across varying data manifold dimensions, model architectures, and dataset scales, using PSNR and FID as evaluation metrics. The analysis shows how the optimal choice of loss weighting and parameterization depends critically on the intrinsic structure of the data. Building on these insights, the authors distill practical design principles that substantially improve denoising accuracy and generation quality.
📝 Abstract
We study the training objectives of denoising-based generative models, with a particular focus on loss weighting and output parameterization, including noise-, clean-image-, and velocity-based formulations. Through a systematic numerical study, we analyze how these training choices interact with the intrinsic dimensionality of the data manifold, the model architecture, and the dataset size. Our experiments span synthetic datasets with controlled geometry as well as image data, and compare training objectives using quantitative metrics for denoising accuracy (PSNR across noise levels) and generative quality (FID). Rather than proposing a new method, our goal is to disentangle the various factors that matter when training a flow matching model, in order to provide practical insights into design choices.
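To make the three output parameterizations concrete, here is a minimal sketch of how they relate on a linear flow-matching path. This assumes the common interpolation `x_t = (1 - t) * x0 + t * eps` with velocity target `v = eps - x0`; the paper may use a different path or sign convention, and the function names are illustrative, not from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def interpolate(x0, eps, t):
    """Linear flow-matching path: x_t = (1 - t) * x0 + t * eps (assumed convention)."""
    return (1.0 - t) * x0 + t * eps

# Hypothetical conversions between the three parameterizations: given x_t, t,
# and a velocity prediction v, recover the implied clean-image and noise predictions.
def velocity_to_clean(x_t, v, t):
    # x_t - t * v = (1 - t) x0 + t eps - t (eps - x0) = x0
    return x_t - t * v

def velocity_to_noise(x_t, v, t):
    # x_t + (1 - t) * v = (1 - t) x0 + t eps + (1 - t)(eps - x0) = eps
    return x_t + (1.0 - t) * v

# Sanity check with exact targets in place of model outputs.
x0 = rng.normal(size=(4,))
eps = rng.normal(size=(4,))
t = 0.3
x_t = interpolate(x0, eps, t)
v = eps - x0  # target velocity for the linear path

assert np.allclose(velocity_to_clean(x_t, v, t), x0)
assert np.allclose(velocity_to_noise(x_t, v, t), eps)
```

Since the parameterizations are affine functions of one another (with t-dependent coefficients), regressing on one target is equivalent to regressing on another up to a t-dependent loss weighting, which is why the paper studies weighting and parameterization jointly.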