🤖 AI Summary
This study addresses the limited flexibility of existing nonlinear mixed-effects modeling tools in handling non-Gaussian longitudinal responses such as binary, count, or time-to-event data. We extend the R package saemix into a unified modeling framework that accommodates a broad class of non-Gaussian outcomes. Built on the stochastic approximation expectation–maximization (SAEM) algorithm, the framework lets users specify custom log-likelihood functions and includes an algorithm that jointly selects covariates and the inter-individual variability structure. The implementation also adds bootstrap-based uncertainty quantification and diagnostic plots adapted to non-Gaussian data. In a simulation study based on the toenail dataset (repeated binary data from a randomised clinical trial of onychomycosis treatment), the approach recovers the true parameter values and is stable across starting values. The joint selection algorithm is further applied to categorical and survival-type data.
📝 Abstract
Background and Objectives: Longitudinal data are increasingly collected in clinical trials to provide information on treatment action and disease evolution. The trajectory of continuous biomarkers such as target hormone concentrations or viral loads can then be modelled in relation to the occurrence of events such as recovery or hospitalisation. Other studies may include repeated measurements of discrete pain scores, numbers of episodes (count) or occurrences of events (survival). Non-linear mixed-effect models (NLMEM) handle individual differences in trajectories while modelling the underlying population evolution, and are the natural choice for analysing such data. The saemix package for R is one of the few open-source solutions, and the most flexible. In this paper, we extend it to accommodate a variety of models for non-Gaussian data. Methods: The saemix package estimates parameters through the Stochastic Approximation Expectation-Maximisation (SAEM) algorithm. Within the package, non-Gaussian models are specified by their log-likelihood functions, affording maximal control over model formulation. We extended the estimation algorithms, as well as the exploratory and diagnostic plots, to non-Gaussian data, and implemented bootstrap approaches to estimate parameter uncertainty. To evaluate the performance of saemix, we performed a simulation study based on the toenail dataset, which contains repeated binary data from a randomised clinical trial. Results: In the simulation study, saemix recovered the true parameter values well and was stable across different starting values for the parameters. An algorithm that jointly searches for the covariate model and the interindividual variability model was also implemented to build the covariate model, and was applied to categorical and survival-type data.
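As an illustration of what "specified by their log-likelihood functions" in the Methods can mean for repeated binary data, here is a minimal Python sketch of a per-observation log-likelihood under a logistic model with subject-specific intercept and slope. It is not the saemix API (saemix is an R package); the function name `binary_logpdf`, its arguments, and the intercept-plus-slope parameterisation are assumptions chosen for illustration only.

```python
import math


def _log1pexp(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def binary_logpdf(psi, ids, times, y):
    """Per-observation log-likelihood for a logistic model of binary outcomes.

    Hypothetical helper for illustration; not part of saemix.
    psi   : list of (intercept, slope) pairs, one per subject
    ids   : subject index (0-based) for each observation
    times : observation time for each observation
    y     : observed outcome (0 or 1) for each observation
    """
    logpdf = []
    for i, t, obs in zip(ids, times, y):
        intercept, slope = psi[i]
        logit = intercept + slope * t
        # log P(Y=1) = -log(1 + exp(-logit)); log P(Y=0) = -log(1 + exp(logit))
        if obs == 1:
            logpdf.append(-_log1pexp(-logit))
        else:
            logpdf.append(-_log1pexp(logit))
    return logpdf


# Two subjects with different individual parameters, three observations in total.
psi = [(0.0, 0.0), (1.0, 0.5)]
ll = binary_logpdf(psi, ids=[0, 0, 1], times=[0.0, 1.0, 2.0], y=[0, 1, 1])
total_loglik = sum(ll)
```

In an SAEM-type algorithm, a function of this kind would be evaluated repeatedly for simulated individual parameters, so writing the model as a log-likelihood is what leaves the choice of response distribution to the user.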