CSFlow: Aligning Flow Matching with Human Contrast Sensitivity

📅 2026-06-07
🤖 AI Summary
Existing flow-matching generative models ignore the human visual system's unequal sensitivity to spatial frequencies, which can leave their outputs short of perceptual realism. This work integrates the human Contrast Sensitivity Function (CSF) into the denoising process of flow matching. Specifically, it introduces CSF-based timestep weights, obtained by aligning the spatial frequencies generated at each noise level with human contrast sensitivity, and applies them through inference-only timestep modification or short fine-tuning, without modifying the model architecture. The method improves visual realism and reduces cartoon-like artifacts, with reported gains of a 4.7% reduction in FID, a 2.2% increase in Inception Score, and a 2.5% improvement in GenEval. The work also proposes a frequency-domain metric that estimates which frequencies are generated at each reverse flow interval.
📝 Abstract
We introduce Contrast Sensitive Flow (CSFlow), a weighting scheme that connects the human eye's Contrast Sensitivity Function (CSF) to the iterative denoising steps of flow matching. Because real-world images concentrate signal at low spatial frequencies, these components reach a high signal-to-noise ratio earlier during continuous diffusion than high-frequency components do. When images are generated with diffusion or flow matching models, this induces a soft autoregressive structure in Fourier space, where coarse image content stabilizes before fine detail. Meanwhile, the human visual system is unequally sensitive to spatial frequencies: very low and very high frequencies require significantly higher contrast to be perceived. We merge these observations for the first time through two contributions: (1) a metric that estimates which frequencies are generated at each reverse flow interval, and (2) timestep weights obtained by aligning the frequencies generated at each noise level with human contrast sensitivity. Experiments show that these weights improve generative performance, lowering FID by 4.7%, increasing Inception Score by 2.2%, and improving GenEval scores by 2.5%, using either inference-only timestep modification or short fine-tuning. Qualitatively, we find that CSFlow weights lead to better visual realism and a less cartoonish appearance in generated images.
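The two observations in the abstract can be illustrated with a small numerical sketch. This is not the paper's metric or weighting scheme; it is a toy model under assumptions not stated in the source: a linear interpolation path x_t = (1 - t) x + t ε with t = 1 as pure noise, white unit-variance noise, a 1/f^α image power spectrum normalized to 1 at 1 cycle/degree, an SNR threshold of 1 to mark when a frequency counts as "generated", and the Mannos-Sakrison approximation as a stand-in CSF. All function names are hypothetical.

```python
import numpy as np

def power_spectrum(f, alpha=2.0):
    """Assumed natural-image power law: signal power proportional to 1 / f^alpha."""
    return 1.0 / np.power(f, alpha)

def snr(f, t, alpha=2.0):
    """Per-frequency SNR on the assumed path x_t = (1 - t) * x + t * eps.

    eps is white noise with unit power; t = 1 is pure noise, t = 0 is the image.
    """
    return (1.0 - t) ** 2 * power_spectrum(f, alpha) / t ** 2

def resolved_time(f, alpha=2.0, threshold=1.0):
    """Noise level t at which frequency f reaches SNR == threshold.

    Solving (1 - t)^2 * S / t^2 = threshold gives t = 1 / (1 + sqrt(threshold / S)).
    """
    s = power_spectrum(f, alpha)
    return 1.0 / (1.0 + np.sqrt(threshold / s))

def frequency_at_time(t, alpha=2.0, threshold=1.0):
    """Inverse of resolved_time: the frequency crossing the threshold at time t."""
    return np.power((1.0 - t) / (t * np.sqrt(threshold)), 2.0 / alpha)

def csf_mannos_sakrison(f):
    """Mannos-Sakrison CSF approximation (f in cycles/degree); band-pass in f."""
    return 2.6 * (0.0192 + 0.114 * f) * np.exp(-np.power(0.114 * f, 1.1))

def timestep_weights(ts, alpha=2.0):
    """Toy timestep weights: CSF of the frequency resolved at each t, normalized."""
    w = csf_mannos_sakrison(frequency_at_time(ts, alpha))
    return w / w.sum()

# Low frequencies cross the SNR threshold at a higher noise level, i.e. earlier
# in the reverse (noise-to-image) process, than high frequencies.
t_low, t_high = resolved_time(1.0), resolved_time(8.0)

# Weights over a grid of noise levels; endpoints are excluded to avoid t = 0.
ts = np.linspace(0.02, 0.98, 49)
weights = timestep_weights(ts)
```

In this toy model the weights are largest at the noise levels where mid-range frequencies (near the CSF peak) are being resolved, and smaller where very low or very high frequencies are generated. How CSFlow actually estimates the generated frequencies and derives its weights is defined in the paper, not here.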
Problem

Research questions and friction points this paper is trying to address.

flow matching
contrast sensitivity
spatial frequency
image generation
human visual system
Innovation

Methods, ideas, or system contributions that make the work stand out.

Contrast Sensitivity Function
Flow Matching
Spatial Frequency Weighting
Generative Modeling
Perceptual Alignment