🤖 AI Summary
This paper addresses three modeling challenges in irregular longitudinal data with matrix-valued responses: skewness, heavy tails, and temporal dependence. We propose a regression model with a matrix-variate skew-t response (REGMVST), in which the rows of each response matrix index a subject's longitudinal measurements. Row-wise dependence across subject-specific, irregularly spaced observation times is modeled with a damped exponential correlation (DEC) function, while the column-wise covariance is left unstructured. To relieve the computational bottleneck of the baseline ECME algorithm, we develop an asynchronous and distributed ECME (ADECME) algorithm that parallelizes the E-step while retaining the simple conditional M-steps. We provide theoretical support for the empirical findings and identify regularity assumptions for ADECME's optimal performance. Simulations and a case study of periodontal disease endpoints derived from electronic health records show that ADECME outperforms the alternatives in efficiency and convergence. An open-source R package implementing the methodology is publicly available.
📝 Abstract
We propose a regression model with matrix-variate skew-t response (REGMVST) for analyzing irregular longitudinal data with skewness, symmetry, or heavy tails. REGMVST models matrix-variate responses and predictors, with rows indexing longitudinal measurements per subject. It uses the matrix-variate skew-t (MVST) distribution to handle skewness and heavy tails, a damped exponential correlation (DEC) structure for row-wise dependencies across irregular time profiles, and leaves the column covariance unstructured. For estimation, we first develop an ECME algorithm and then mitigate its computational bottleneck via an asynchronous and distributed ECME (ADECME) extension. ADECME accelerates the E-step through parallelization while retaining the simplicity of the conditional M-step, enabling scalable inference. Simulations using synthetic data and a case study exploring matrix-variate periodontal disease endpoints derived from electronic health records demonstrate ADECME's superiority over the alternatives in efficiency and convergence. We also provide theoretical support for our empirical observations and identify regularity assumptions for ADECME's optimal performance. An accompanying R package is available at https://github.com/rh8liuqy/STMATREG.
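To make the DEC structure concrete, the following is a minimal Python sketch of a damped exponential correlation matrix for one subject's irregular visit times. It assumes the commonly used DEC form, corr(t_j, t_k) = phi1^(|t_j - t_k|^phi2) with 0 <= phi1 < 1 and phi2 >= 0; the abstract does not state the exact parametrization, so the paper's version may differ. The function name `dec_correlation` and its arguments are hypothetical and are not taken from the STMATREG package.

```python
def dec_correlation(times, phi1, phi2):
    """Build a damped exponential correlation (DEC) matrix.

    Assumed form (not confirmed by the abstract):
        corr(t_j, t_k) = phi1 ** (|t_j - t_k| ** phi2)
    times : observation times for one subject (may be irregularly spaced)
    phi1  : correlation at unit lag, 0 <= phi1 < 1
    phi2  : damping parameter, phi2 >= 0
    """
    if not (0.0 <= phi1 < 1.0):
        raise ValueError("phi1 must lie in [0, 1)")
    if phi2 < 0.0:
        raise ValueError("phi2 must be non-negative")

    n = len(times)
    # Start from all ones so the diagonal is already correct.
    corr = [[1.0] * n for _ in range(n)]
    for j in range(n):
        for k in range(j + 1, n):
            lag = abs(times[j] - times[k])
            value = phi1 ** (lag ** phi2)
            # A correlation matrix is symmetric, so fill both triangles.
            corr[j][k] = value
            corr[k][j] = value
    return corr


if __name__ == "__main__":
    # Hypothetical visit times (in years) for a single subject.
    visit_times = [0.0, 0.5, 2.0]
    for row in dec_correlation(visit_times, phi1=0.5, phi2=1.0):
        print([round(v, 4) for v in row])
```

Under this assumed form, phi2 = 1 gives a continuous-time AR(1) pattern in which correlation decays geometrically with the time gap, and phi2 = 0 gives compound symmetry, with every off-diagonal entry equal to phi1. Because the matrix depends only on each subject's own time gaps, subjects can have different numbers of visits and different spacing.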