A Latent Variable Approach to Learning High-dimensional Multivariate Longitudinal Data

📅 2024-05-23
📈 Citations: 2
Influential: 0
🤖 AI Summary
This paper addresses covariate effect inference and future outcome prediction in high-dimensional multivariate longitudinal data, which feature complex dependence among variables and over time, mixed-type outcomes, and outcomes that may be unobserved at some time points (for example, due to right censoring). It proposes a latent variable model that (i) handles mixed-type outcomes and incomplete observations in one general setting; (ii) introduces an information criterion for choosing the number of latent factors; and (iii) establishes a central limit theorem for the regression coefficient estimators, which supports statistical inference. The model is applied to customer grocery shopping records to predict and understand shopping behavior.

📝 Abstract
High-dimensional multivariate longitudinal data, which arise when many outcome variables are measured repeatedly over time, are becoming increasingly common in social, behavioral and health sciences. We propose a latent variable model for drawing statistical inferences on covariate effects and predicting future outcomes based on high-dimensional multivariate longitudinal data. This model introduces unobserved factors to account for the between-variable and across-time dependence and assist the prediction. Statistical inference and prediction tools are developed under a general setting that allows outcome variables to be of mixed types and possibly unobserved for certain time points, for example, due to right censoring. A central limit theorem is established for drawing statistical inferences on regression coefficients. Additionally, an information criterion is introduced to choose the number of factors. The proposed model is applied to customer grocery shopping records to predict and understand shopping behavior.
Problem

Research questions and friction points this paper is trying to address.

Modeling high-dimensional multivariate longitudinal data dependencies
Predicting future outcomes with mixed-type and missing data
Analyzing covariate effects using latent variable approach
Innovation

Methods, ideas, or system contributions that make the work stand out.

Latent variable model for high-dimensional data
Unobserved factors capturing between-variable and across-time dependence
Information criterion to select factor number
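
To make the data setting concrete, here is a minimal simulation sketch of mixed-type longitudinal outcomes driven by covariates and unobserved factors, with some entries left unobserved. It is an illustration only, not the paper's specification: the function name `simulate_mixed_longitudinal`, the Gaussian and logistic outcome models, the random-walk factors, the time-invariant covariates, and the missing-completely-at-random mask are all assumptions chosen for simplicity.

```python
import numpy as np


def simulate_mixed_longitudinal(n=50, n_times=4, n_cont=3, n_bin=2,
                                n_factors=2, n_covariates=2,
                                missing_rate=0.2, seed=0):
    """Simulate outcomes of shape (n, n_times, n_cont + n_bin).

    Hypothetical generative sketch (all modeling choices are assumptions):
    linear predictor = covariate effect + loadings times latent factors.
    """
    rng = np.random.default_rng(seed)
    n_outcomes = n_cont + n_bin

    # Time-invariant covariates and their regression coefficients.
    x = rng.normal(size=(n, n_covariates))
    beta = rng.normal(size=(n_outcomes, n_covariates))

    # Factor loadings shared across individuals and time points.
    loadings = rng.normal(size=(n_outcomes, n_factors))

    # Random-walk factors: the cumulative sum over time induces
    # across-time dependence (an assumed, simplistic dynamic).
    factors = np.cumsum(rng.normal(size=(n, n_times, n_factors)), axis=1)

    # Linear predictor, broadcast to shape (n, n_times, n_outcomes).
    eta = (x @ beta.T)[:, None, :] + factors @ loadings.T

    y = np.empty_like(eta)
    # Continuous outcomes: linear predictor plus Gaussian noise.
    y[..., :n_cont] = eta[..., :n_cont] + rng.normal(size=(n, n_times, n_cont))
    # Binary outcomes: Bernoulli draws through a logistic link.
    prob = 1.0 / (1.0 + np.exp(-eta[..., n_cont:]))
    y[..., n_cont:] = rng.binomial(1, prob)

    # Mark entries as unobserved completely at random and blank them out.
    observed = rng.random(size=y.shape) >= missing_rate
    y[~observed] = np.nan
    return y, x, observed


y, x, observed = simulate_mixed_longitudinal()
print(y.shape, observed.mean())
```

The shared loadings tie the outcome variables together at each time point, while the factor dynamics tie time points together; a fitted model of this kind would need to estimate `beta`, the loadings, and the factors from the observed entries only.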