🤖 AI Summary
Traditional multistate models impose restrictive linear assumptions between covariates and transition intensities and suffer from high-dimensional computational bottlenecks. To address these limitations, this paper proposes a nonparametric continuous-time Markov chain (CTMC) modeling framework grounded in reproducing kernel Hilbert spaces (RKHS). It is the first to incorporate the RKHS generalized representer theorem into CTMC covariate modeling, thereby relaxing linearity constraints on transition intensities. We develop a dual-path inference strategy: a frequentist path employing RKHS norm regularization for smooth intensity estimation, and a Bayesian path integrating spike-and-slab priors with an enhanced EMVS algorithm for high-dimensional sparse selection. In simulation studies and analysis of follicular lymphoma data, the proposed method reduces normalized error in nonlinear transition function estimation by 37% and significantly improves prediction accuracy of absorption probabilities at terminal states, demonstrating its superior capacity for long-term modeling of complex clinical and behavioral state transitions.
📝 Abstract
We propose a novel nonparametric approach for linking covariates to Continuous Time Markov Chains (CTMCs) using the mathematical framework of Reproducing Kernel Hilbert Spaces (RKHS). CTMCs provide a robust framework for modeling transitions across clinical or behavioral states, but traditional multistate models often rely on linear relationships. In contrast, we use a generalized Representer Theorem to enable tractable inference in functional space. For the Frequentist version, we apply normed square penalties, while for the Bayesian version, we explore sparsity inducing spike and slab priors. Due to the computational challenges posed by high-dimensional spaces, we successfully adapt the Expectation Maximization Variable Selection (EMVS) algorithm to efficiently identify the posterior mode. We demonstrate the effectiveness of our method through extensive simulation studies and an application to follicular cell lymphoma data. Our performance metrics include the normalized difference between estimated and true nonlinear transition functions, as well as the difference in the probability of getting absorbed in one the final states, capturing the ability of our approach to predict long-term behaviors.