🤖 AI Summary
This work addresses signal-domain covariate shift in supervised learning for diffusion MRI microstructural parameter estimation. The shift arises when the noise characteristics of simulated training data do not match those of acquired data, and it leads to systematic, SNR-dependent bias, particularly at low signal-to-noise ratio (SNR). To mitigate it, the authors propose a Realistic Noise Synthesis (RNS) framework that incorporates both the Rician noise expectation and the effective post-processing noise variance into simulated training signals. The Rician expectation is modelled using a noise standard deviation estimated with Marchenko–Pastur principal component analysis (MPPCA), while the effective standard deviation is derived from spherical harmonic residuals of the preprocessed data. Aligning the training and inference signal distributions in this way reduces estimation bias at low SNR to the level of noise-aware nonlinear least-squares fitting for both the cylinder-zeppelin and SANDI models, and modelling the effective standard deviation further improves precision. Performance is largely independent of the regression architecture but sensitive to the accuracy of the noise estimate. Validation uses simulations at multiple SNR levels and in vivo diffusion data with repeated acquisitions.
📝 Abstract
Diffusion MRI enables non-invasive probing of tissue microstructure, but accurate parameter estimation is challenged by noise-related effects. In supervised machine learning frameworks trained on simulated data, discrepancies between the noise characteristics of simulated and acquired signals introduce a form of covariate shift, whereby the input signal distribution differs between training and inference. We investigated the impact of this mismatch on microstructure parameter estimation and propose a realistic noise synthesis (RNS) framework to mitigate it. RNS incorporates both the Rician expectation and the effective post-processing noise variance into simulated training signals. The Rician expectation was modelled using a noise standard deviation estimated with MPPCA, while the effective standard deviation was derived from spherical harmonic residuals of preprocessed data. The method was evaluated using the cylinder-zeppelin and the SANDI models on simulated datasets across multiple SNR levels and on in vivo diffusion data with repeated acquisitions. Sensitivity to noise misestimation was also assessed. Ignoring magnitude-induced noise effects during training produced systematic, SNR-dependent parameter bias, particularly at low SNR. Incorporating the Rician expectation substantially reduced bias to the level of noise-aware nonlinear least-squares fitting. Modelling the effective standard deviation further improved precision. Performance was largely independent of regression architecture but sensitive to accurate noise estimation. These findings demonstrate that realistic noise modelling in simulated training data mitigates signal-domain covariate shift and is essential for unbiased supervised microstructure estimation, particularly in low-SNR regimes associated with high b-values or high spatial resolution.
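The two noise ingredients named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes the training signal is formed by replacing the noise-free signal with its Rician expectation (using a noise level such as one estimated by MPPCA) and then adding zero-mean Gaussian noise with the effective post-processing standard deviation. The function names, the parameters `sigma_rician` and `sigma_eff`, and the toy mono-exponential signal are hypothetical. The Rician mean uses the standard closed form E[M] = σ·sqrt(π/2)·L_{1/2}(−S²/(2σ²)), with the scaled Bessel functions evaluated by numerical integration so that only NumPy is needed.

```python
import numpy as np


def _scaled_bessel_i(order, t, n_theta=2001):
    """Return exp(-t) * I_order(t) for t >= 0.

    Uses I_n(t) = (1/pi) * integral_0^pi exp(t*cos(theta)) * cos(n*theta) d(theta),
    evaluated with a plain trapezoid rule (hypothetical helper, NumPy only).
    """
    theta = np.linspace(0.0, np.pi, n_theta)
    t = np.asarray(t, dtype=float)[..., None]
    integrand = np.exp(t * (np.cos(theta) - 1.0)) * np.cos(order * theta)
    dtheta = theta[1] - theta[0]
    total = integrand.sum(axis=-1) - 0.5 * (integrand[..., 0] + integrand[..., -1])
    return total * dtheta / np.pi


def rician_expectation(signal, sigma):
    """Mean of the magnitude |signal + complex Gaussian noise of std sigma|."""
    signal = np.asarray(signal, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    t = signal**2 / (4.0 * sigma**2)
    # L_{1/2}(-2t) written with exponentially scaled Bessel functions.
    laguerre_half = (1.0 + 2.0 * t) * _scaled_bessel_i(0, t) + 2.0 * t * _scaled_bessel_i(1, t)
    return sigma * np.sqrt(np.pi / 2.0) * laguerre_half


def synthesize_training_signal(clean, sigma_rician, sigma_eff, rng):
    """Assumed RNS-style synthesis: Rician-biased mean plus Gaussian spread."""
    biased = rician_expectation(clean, sigma_rician)
    return biased + rng.normal(0.0, sigma_eff, size=biased.shape)


# Toy example: mono-exponential decay over b-values (illustrative only).
rng = np.random.default_rng(0)
bvals = np.array([0.0, 1.0, 3.0, 6.0])           # ms/um^2, hypothetical protocol
clean = np.exp(-bvals * 0.8)                      # noise-free normalised signal
noisy = synthesize_training_signal(clean, sigma_rician=0.05, sigma_eff=0.02, rng=rng)
print(noisy)
```

Under these assumptions the high-b measurements, where the clean signal approaches zero, are lifted towards the noise floor of about 1.25·σ, which is the magnitude-induced bias that training without a Rician term would miss.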