🤖 AI Summary
When outcomes are common, the odds ratio (OR) from logistic regression deviates substantially from the risk ratio (RR), which can lead to misinterpreted exposure effects. Method: This paper proposes complementary log-log (cloglog) regression as a practical way to approximate the RR and establishes a theoretical result: the effect measure induced by the cloglog model, called the complementary log ratio, always approximates the RR with smaller bias than the OR does. To make the two measures comparable, the paper adopts the one-parameter Aranda-Ordaz family of link functions, which contains both the logit and cloglog links as special cases. Contribution/Results: A theoretical comparison within this unified framework, reinforced by simulation studies, shows that the complementary log ratio is a closer approximation to the RR than the OR. The model is easy to fit in standard statistical software (e.g., R or SAS), so the approach combines theoretical rigor with practical feasibility.
📝 Abstract
Odds ratios obtained from logistic models fail to approximate risk ratios when outcomes are common, which can lead practitioners to misinterpret exposure effects. This article investigates complementary log-log models as a practical alternative for approximating risk ratios. We demonstrate that the effect measure of a complementary log-log model, called the complementary log ratio in this article, consistently provides a closer approximation to the risk ratio than the odds ratio does. To compare approximation accuracy, we adopt the one-parameter Aranda-Ordaz family of link functions, which includes both the logit and complementary log-log link functions as special cases. Within this unified framework, we carry out a theoretical comparison of approximation accuracy between the complementary log ratio and the odds ratio, showing that the former always produces smaller approximation bias. Simulation studies further reinforce our theoretical findings. Given that the complementary log-log model is easily implemented in standard statistical software such as R and SAS, we encourage more frequent use of this model as a simple and effective alternative to logistic models when the goal is to approximate risk ratios more accurately.
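The comparison described in the abstract can be illustrated numerically for a single binary exposure. The sketch below is not taken from the article. It assumes the complementary log ratio is the exponentiated cloglog coefficient, log(1 - p1) / log(1 - p0), and it uses the standard asymmetric Aranda-Ordaz link, log{[(1 - p)^(-λ) - 1] / λ}, which reduces to the logit at λ = 1 and to the complementary log-log link as λ → 0. The function names and example risks are hypothetical.

```python
import math


def risk_ratio(p1, p0):
    """Risk ratio for exposed risk p1 versus unexposed risk p0."""
    return p1 / p0


def odds_ratio(p1, p0):
    """Odds ratio, i.e. the exponentiated logit-scale difference."""
    return (p1 / (1 - p1)) / (p0 / (1 - p0))


def complementary_log_ratio(p1, p0):
    """Exponentiated cloglog-scale difference (assumed definition):
    log(1 - p1) / log(1 - p0)."""
    return math.log1p(-p1) / math.log1p(-p0)


def aranda_ordaz_link(p, lam):
    """Asymmetric Aranda-Ordaz link; lam = 1 is logit, lam = 0 is cloglog."""
    if lam == 0:
        return math.log(-math.log1p(-p))
    return math.log(((1 - p) ** (-lam) - 1) / lam)


def aranda_ordaz_ratio(p1, p0, lam):
    """Effect measure implied by the Aranda-Ordaz link with parameter lam."""
    return math.exp(aranda_ordaz_link(p1, lam) - aranda_ordaz_link(p0, lam))


if __name__ == "__main__":
    # Hypothetical (unexposed, exposed) risks, from rare to common outcomes.
    for p0, p1 in [(0.02, 0.04), (0.2, 0.4), (0.4, 0.8)]:
        rr = risk_ratio(p1, p0)
        clr = complementary_log_ratio(p1, p0)
        odds = odds_ratio(p1, p0)
        print(f"p0={p0:.2f} p1={p1:.2f}  RR={rr:.3f}  CLR={clr:.3f}  OR={odds:.3f}")
```

With unexposed risk 0.2 and exposed risk 0.4, for example, the risk ratio is 2, the complementary log ratio is about 2.29, and the odds ratio is about 2.67, so the complementary log ratio sits between the other two and closer to the risk ratio.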