🤖 AI Summary
Existing methods struggle to evaluate the overall goodness-of-fit and predictive performance of joint models for longitudinal and time-to-event (TTE) data, particularly when event times are censored. This work extends the normalised prediction discrepancies and normalised prediction distribution errors (npd/npde) framework to joint models with censored outcomes for the first time, proposing a combined test that integrates longitudinal and survival information. Prediction discrepancies are computed by Monte Carlo simulation; for censored event times they are imputed from a uniform distribution based on the model-predicted probability of censoring. The p-values of the longitudinal and TTE tests are then combined with a Bonferroni correction. Across the misspecification scenarios studied, the Type I error rate of the combined test stays close to the nominal 5%, and statistical power increases with sample size and with the degree of departure from the true model. Graphical diagnostics relate this gain in power to larger differences in the shape of the survival function or of the longitudinal trajectory (here, PSA profiles).
📝 Abstract
Introduction: Joint models are increasingly used in clinical trials. An important part of model building is to properly assess the descriptive and predictive ability of these models. Normalised prediction discrepancies (npd) and normalised prediction distribution errors (npde) have been developed to evaluate non-linear mixed-effect models for continuous responses, both graphically and statistically. In this work, we propose a combined test to evaluate joint models.
Methods: Prediction discrepancies (pd) are defined as the quantile of each observation within its predictive distribution and are obtained by Monte Carlo simulation. The pd for unobserved (censored) event times are imputed from a uniform distribution based on the model-predicted probability of censoring, using a method similar to the one developed to handle data below the lower limit of quantification (LOQ). We propose to combine the p-values of the tests on longitudinal data and on time-to-event (TTE) data, adjusted with a Bonferroni correction. We performed simulation studies based on a joint model characterising the relationship between the prostate-specific antigen (PSA) biomarker and survival in prostate cancer patients, to evaluate the type I error and power of npd/npde to detect different types of model misspecification.
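The steps described in the Methods can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation. The function names (`pd_observed`, `pd_censored`, `npd`, `bonferroni_combined`) are hypothetical. The imputation interval for a right-censored event, from the model-predicted probability of an event before the censoring time up to 1, is an assumption made by analogy with the LOQ approach. The clipping of pd away from 0 and 1 is a numerical convenience that the abstract does not mention.

```python
import numpy as np
from statistics import NormalDist


def pd_observed(y_obs, y_sim):
    """pd of an observation: its quantile among K Monte Carlo replicates."""
    y_sim = np.asarray(y_sim, dtype=float)
    k = y_sim.size
    pd = np.mean(y_sim < y_obs)
    # Keep pd strictly inside (0, 1) so its normal quantile stays finite
    # (an assumption of this sketch, not stated in the abstract).
    return float(np.clip(pd, 1.0 / (2 * k), 1.0 - 1.0 / (2 * k)))


def pd_censored(t_cens, t_sim, rng):
    """Impute the pd of an event time right-censored at t_cens.

    Assumption: the unobserved event time exceeds t_cens, so its pd is
    drawn uniformly between the simulated P(T <= t_cens) and 1.
    """
    t_sim = np.asarray(t_sim, dtype=float)
    k = t_sim.size
    p_before = np.mean(t_sim <= t_cens)
    pd = rng.uniform(p_before, 1.0)
    return float(np.clip(pd, 1.0 / (2 * k), 1.0 - 1.0 / (2 * k)))


def npd(pd):
    """Normalise a pd with the inverse standard normal CDF."""
    return NormalDist().inv_cdf(pd)


def bonferroni_combined(p_values):
    """Bonferroni-adjusted p-value for a set of tests (here, 2 tests)."""
    return min(1.0, len(p_values) * min(p_values))
```

For example, `bonferroni_combined([p_longitudinal, p_tte])` would reject the model at the 5% level when either component p-value falls below 0.025.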
Results: For all types of misspecification, the type I error of the combined test was close to the expected 5%. The power of the combined test to detect model misspecification increased with the difference from the true model and, as expected, with sample size. Graphically, the increase in power can be related to larger differences in the shape of the survival function or of the PSA evolution.
Conclusions: npd can be readily extended to event data by imputing the pd for censored events under the model. The combined test showed an adequate type I error and was quite sensitive to the alternative models tested.