🤖 AI Summary
Cross-classified data are common in education, healthcare, and the social sciences, but regression models with crossed random effects are computationally demanding for non-Gaussian outcomes because the likelihood requires high-dimensional integration. To address this, we propose a scalable modeling framework that discretizes the distribution of each random effect into a finite number of groups, yielding a flexible yet computationally tractable multi-way grouped mixture model. The framework covers generalized linear models, including logistic, Poisson, and ordered probit regression, and so accommodates diverse response types. We establish consistency and asymptotic normality of the estimator. A fast iterative optimization algorithm delivers computational speedups, ranging from several-fold to over an order of magnitude, over conventional methods while preserving statistical efficiency. In both simulations and real-data applications, the approach yields accurate inference at substantially reduced cost. This work offers a general-purpose approach to large-scale crossed random-effects analysis that combines theoretical guarantees with practical scalability.
📝 Abstract
Cross-classified data frequently arise in scientific fields such as education, healthcare, and the social sciences. A common modeling strategy is to introduce crossed random effects within a regression framework. However, this approach often encounters serious computational bottlenecks, particularly for non-Gaussian outcomes. In this paper, we propose a scalable and flexible method that approximates the distribution of each random effect by a discrete distribution, effectively partitioning the random effects into a finite number of representative groups. This approximation allows us to express the model as a multi-way grouped structure, which can be efficiently estimated using a simple and fast iterative algorithm. The proposed method accommodates a wide range of outcome models and remains applicable to cross-classifications with more than two ways. We theoretically establish the consistency and asymptotic normality of the estimator for cross-classifications of general order. Through simulation studies and real data applications, we demonstrate the practical performance of the proposed method in logistic, Poisson, and ordered probit regression models involving cross-classified structures.
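To make the grouping idea concrete, the following is a minimal sketch of a two-way grouped Poisson regression fitted by alternating updates. It is an illustration under stated assumptions, not the authors' algorithm: the function name `fit_two_way_grouped_poisson`, the single covariate `x`, the hard (k-means style) group assignment, and the specific update rules are all choices made here for illustration. The assumed model is log E[y] = x*beta + a[g[row]] + b[h[col]], where each row unit belongs to one of G groups and each column unit to one of H groups.

```python
import numpy as np

def poisson_loglik(y, eta):
    """Poisson log-likelihood, dropping the constant -sum(log(y!))."""
    return float(np.sum(y * eta - np.exp(eta)))

def fit_two_way_grouped_poisson(y, x, row, col, G, H, n_iter=30, seed=0):
    """Hypothetical sketch: alternate between parameter and label updates."""
    rng = np.random.default_rng(seed)
    n_row, n_col = int(row.max()) + 1, int(col.max()) + 1
    g = rng.integers(G, size=n_row)   # group label of each row unit
    h = rng.integers(H, size=n_col)   # group label of each column unit
    a, b, beta = np.zeros(G), np.zeros(H), 0.0
    y_row = np.bincount(row, weights=y, minlength=n_row)
    y_col = np.bincount(col, weights=y, minlength=n_col)
    history = []
    for _ in range(n_iter):
        # Step 1: closed-form coordinate updates of the group effects.
        w = np.exp(x * beta + b[h[col]])
        num = np.bincount(g[row], weights=y, minlength=G)
        den = np.bincount(g[row], weights=w, minlength=G)
        ok = (num > 0) & (den > 0)    # skip empty or all-zero groups
        a[ok] = np.log(num[ok] / den[ok])
        w = np.exp(x * beta + a[g[row]])
        num = np.bincount(h[col], weights=y, minlength=H)
        den = np.bincount(h[col], weights=w, minlength=H)
        ok = (num > 0) & (den > 0)
        b[ok] = np.log(num[ok] / den[ok])

        # Step 2: one Newton step for the slope, halved until it does not
        # lower the log-likelihood.
        eta = x * beta + a[g[row]] + b[h[col]]
        mu = np.exp(eta)
        step = np.sum(x * (y - mu)) / np.sum(x * x * mu)
        base = poisson_loglik(y, eta)
        while abs(step) > 1e-10 and poisson_loglik(y, eta + x * step) < base:
            step *= 0.5
        if abs(step) > 1e-10:
            beta += step

        # Step 3: reassign each row unit, then each column unit, to the
        # group that maximizes its own log-likelihood contribution.
        w_row = np.bincount(row, weights=np.exp(x * beta + b[h[col]]),
                            minlength=n_row)
        g = np.argmax(np.outer(y_row, a) - np.outer(w_row, np.exp(a)), axis=1)
        w_col = np.bincount(col, weights=np.exp(x * beta + a[g[row]]),
                            minlength=n_col)
        h = np.argmax(np.outer(y_col, b) - np.outer(w_col, np.exp(b)), axis=1)

        history.append(poisson_loglik(y, x * beta + a[g[row]] + b[h[col]]))
    return beta, a, b, g, h, history

# Simulated, fully crossed 60 x 50 design (all values here are made up).
rng = np.random.default_rng(1)
n_row, n_col = 60, 50
row, col = np.divmod(np.arange(n_row * n_col), n_col)
x = rng.normal(size=row.size)
g_true, h_true = rng.integers(3, size=n_row), rng.integers(2, size=n_col)
a_true, b_true = np.array([-0.5, 0.5, 1.5]), np.array([0.0, 1.0])
y = rng.poisson(np.exp(0.5 * x + a_true[g_true][row] + b_true[h_true][col]))

beta_hat, a_hat, b_hat, g_hat, h_hat, hist = fit_two_way_grouped_poisson(
    y, x, row, col, G=3, H=2)
print("estimated slope:", round(float(beta_hat), 3))
```

Each step only needs sums over observations, so no integration over random effects is required; this is the source of the computational appeal described above. Because the group effects `a` and `b` are identified only up to a common shift, and the algorithm can stop at a local optimum, several random starts (different `seed` values) would be a sensible precaution in practice.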