🤖 AI Summary
To address the sensitivity of parameter estimation to outliers in multivariate Gaussian linear regression, this paper proposes a robust least-squares objective based on the Mahalanobis distance, with an optional ridge penalty. It unifies online (stochastic gradient descent with averaging) and offline (fixed-point iteration) robust estimation under a single theoretical framework and establishes the asymptotic normality of the resulting estimators. Because the noise covariance is typically unknown, the method plugs in a robust covariance estimator, such as the Minimum Covariance Determinant (MCD), which substantially improves robustness to both a misspecified noise covariance and outliers. Experiments on synthetic data show that the proposed estimators markedly outperform classical least squares in the presence of outliers, with the online variant combining high accuracy and computational efficiency. All algorithms are open-sourced in the CRAN package *RobRegression*.
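To make the setup concrete, the following is one plausible formalization inferred from the abstract; the paper's exact robustification and step-size schedule may differ. A standard way to robustify the Mahalanobis least-squares contrast is to drop the square (as for the geometric median): for responses $Y \in \mathbb{R}^q$, covariates $X \in \mathbb{R}^p$, parameter $\theta \in \mathbb{R}^{p \times q}$, and noise covariance $\Sigma$,

$$
G_\lambda(\theta) \;=\; \mathbb{E}\bigl[\,\|Y - \theta^\top X\|_{\Sigma^{-1}}\bigr] \;+\; \frac{\lambda}{2}\,\|\theta\|_F^2,
\qquad
\|u\|_{\Sigma^{-1}} = \sqrt{u^\top \Sigma^{-1} u}.
$$

The online method (i) would then be a Robbins–Monro recursion with Ruppert–Polyak averaging, with step sizes $\gamma_n = c\,n^{-\alpha}$, $\alpha \in (1/2, 1)$:

$$
\theta_{n+1} \;=\; \theta_n + \gamma_{n+1}\left(\frac{X_{n+1}\,(Y_{n+1} - \theta_n^\top X_{n+1})^\top \Sigma^{-1}}{\|Y_{n+1} - \theta_n^\top X_{n+1}\|_{\Sigma^{-1}}} - \lambda\,\theta_n\right),
\qquad
\bar\theta_n \;=\; \frac{1}{n}\sum_{k=1}^{n}\theta_k,
$$

and the offline method (ii) a Weiszfeld-type fixed point (written here without the ridge term for simplicity), i.e. iteratively reweighted least squares with weights $w_i^{(k)} = \|Y_i - (\theta^{(k)})^\top X_i\|_{\Sigma^{-1}}^{-1}$:

$$
\theta^{(k+1)} \;=\; \Bigl(\sum_{i=1}^{n} w_i^{(k)}\, X_i X_i^\top\Bigr)^{-1} \sum_{i=1}^{n} w_i^{(k)}\, X_i Y_i^\top .
$$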
📝 Abstract
We consider the robust estimation of the parameters of multivariate Gaussian linear regression models. To this aim, we consider a robust version of the usual (Mahalanobis) least-squares criterion, with or without Ridge regularization. For each considered contrast, we introduce two methods: (i) online stochastic gradient descent algorithms and their averaged versions, and (ii) offline fixed-point algorithms. Under weak assumptions, we prove the asymptotic normality of the resulting estimates. Because the covariance matrix of the noise is usually unknown, we propose to plug a robust estimate of it into the Mahalanobis-based stochastic gradient descent algorithms. We show, on synthetic data, the dramatic gain in robustness of the proposed estimates as compared to the classical least-squares ones. We also show the computational efficiency of the online versions of the proposed algorithms. All the proposed algorithms are implemented in the R package RobRegression, available on CRAN.
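As a rough illustration of the online approach, the R sketch below runs the averaged-SGD recursion above on contaminated synthetic data, assuming $\Sigma = I_q$ and no ridge penalty ($\lambda = 0$), and compares it to ordinary least squares. This is not the RobRegression interface (check the package documentation on CRAN for its actual API); in practice $\Sigma$ would be replaced by a robust plug-in estimate such as `robustbase::covMcd`.

```r
## Minimal sketch of the averaged-SGD recursion above, assuming Sigma = I_q and
## lambda = 0. NOT the RobRegression API; it only illustrates the idea.
set.seed(1)
n <- 5000; p <- 5; q <- 3
theta_true <- matrix(rnorm(p * q), p, q)
X <- matrix(rnorm(n * p), n, p)
Y <- X %*% theta_true + matrix(rnorm(n * q), n, q)

## Contaminate 10% of the observations with gross outliers
bad <- sample(n, n %/% 10)
Y[bad, ] <- Y[bad, ] + matrix(rnorm(length(bad) * q, sd = 50), length(bad), q)

## Averaged stochastic gradient descent on the robust (non-squared) criterion
alpha <- 0.75; c_gamma <- 1           # step sizes gamma_i = c_gamma * i^(-alpha)
theta     <- matrix(0, p, q)          # current iterate
theta_bar <- matrix(0, p, q)          # Ruppert-Polyak average
for (i in seq_len(n)) {
  r  <- as.vector(Y[i, ] - t(theta) %*% X[i, ])  # residual (length q)
  nr <- sqrt(sum(r^2))                           # Euclidean norm since Sigma = I
  if (nr > 0)                                    # gradient step on ||r||
    theta <- theta + c_gamma * i^(-alpha) * outer(X[i, ], r) / nr
  theta_bar <- theta_bar + (theta - theta_bar) / i
}

## Ordinary least squares on the contaminated data, for comparison
theta_ols <- qr.solve(X, Y)

c(robust_sgd = norm(theta_bar - theta_true, "F"),
  ols        = norm(theta_ols - theta_true, "F"))
```

Normalizing the residual caps each observation's influence on the update at $\gamma_i \|X_i\|$, which is why a single gross outlier cannot drag the iterate arbitrarily far, while the averaging step recovers the fast $1/\sqrt{n}$ rate from the slowly decaying step sizes.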