🤖 AI Summary
This study addresses a challenge in diagnosis with continuous biomarkers: because the Youden-optimal cut-off is itself estimated from the data, the sensitivity and specificity estimates evaluated at it are statistically dependent, which complicates joint inference. The authors propose a semiparametric framework based on the density ratio model that combines maximum empirical likelihood estimation with a range-preserving logit transformation. They establish the joint asymptotic normality of the estimated cut-off, sensitivity, and specificity, and use it to construct Wald-type and logit-transformed confidence regions. The approach is presented as the first to enable valid joint inference for sensitivity and specificity at the Youden-optimal cut-off. It aims to balance robustness to model misspecification with estimation efficiency, addressing limitations of conventional parametric and fully nonparametric approaches. Simulation studies show that the resulting confidence regions attain accurate coverage across a variety of distributional settings, with improved efficiency relative to existing alternatives. An application to COVID-19 antibody data illustrates the method's practical advantages for diagnostic decision-making.
📝 Abstract
Sensitivity and specificity evaluated at an optimal diagnostic cut-off are fundamental measures of classification accuracy when continuous biomarkers are used for disease diagnosis. Joint inference for these quantities is challenging because their estimators are evaluated at a common, data-driven threshold estimated from both diseased and healthy samples, inducing statistical dependence. Existing approaches are largely based on parametric assumptions or fully nonparametric procedures, which may be sensitive to model misspecification or lack efficiency in moderate samples. We propose a semiparametric framework for joint inference on sensitivity and specificity at the Youden-optimal cut-off under the density ratio model. Using maximum empirical likelihood, we derive estimators of the optimal threshold and the corresponding sensitivity and specificity, and establish their joint asymptotic normality. This leads to Wald-type and range-preserving logit-transformed confidence regions. Simulation studies show that the proposed method achieves accurate coverage with improved efficiency relative to existing parametric and nonparametric alternatives across a variety of distributional settings. An analysis of COVID-19 antibody data demonstrates the practical advantages of the proposed approach for diagnostic decision-making.
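To make the quantities in the abstract concrete, the sketch below computes a plain empirical (fully nonparametric) Youden-optimal cut-off together with the sensitivity and specificity at that cut-off, and shows a range-preserving logit-transformed interval for a single proportion. It is not the authors' semiparametric estimator: the density ratio model and maximum empirical likelihood steps are not reproduced here. The function names `youden_cutoff` and `logit_interval`, the convention that larger biomarker values indicate disease, and the standard error passed to `logit_interval` are all assumptions made for illustration.

```python
import numpy as np


def youden_cutoff(healthy, diseased):
    """Empirical Youden-optimal cut-off (illustrative, nonparametric).

    Assumes a subject is classified as diseased when the biomarker
    value is strictly greater than the cut-off.
    """
    healthy = np.asarray(healthy, dtype=float)
    diseased = np.asarray(diseased, dtype=float)
    # Candidate cut-offs: every distinct observed value from both samples.
    candidates = np.unique(np.concatenate([healthy, diseased]))
    # Specificity: fraction of healthy subjects at or below the cut-off.
    spec = np.array([(healthy <= c).mean() for c in candidates])
    # Sensitivity: fraction of diseased subjects above the cut-off.
    sens = np.array([(diseased > c).mean() for c in candidates])
    # Youden index J(c) = sensitivity(c) + specificity(c) - 1.
    j = sens + spec - 1.0
    best = int(np.argmax(j))  # first maximiser if there are ties
    return candidates[best], sens[best], spec[best], j[best]


def logit_interval(p_hat, se, z=1.96):
    """Logit-transformed Wald interval for one proportion in (0, 1).

    `se` is an assumed standard error of p_hat on the original scale;
    the delta method gives se / (p_hat * (1 - p_hat)) on the logit scale.
    Back-transforming keeps both end points strictly inside (0, 1).
    """
    logit = np.log(p_hat / (1.0 - p_hat))
    se_logit = se / (p_hat * (1.0 - p_hat))
    lower, upper = logit - z * se_logit, logit + z * se_logit
    expit = lambda x: 1.0 / (1.0 + np.exp(-x))
    return expit(lower), expit(upper)


if __name__ == "__main__":
    # Toy data, invented purely for illustration.
    healthy = [1, 2, 3, 4]
    diseased = [3, 5, 6, 7]
    c, se_hat, sp_hat, j_hat = youden_cutoff(healthy, diseased)
    print(c, se_hat, sp_hat, j_hat)
    print(logit_interval(0.75, 0.1))
```

Because the same data-driven cut-off enters both proportions, the two estimates are dependent, which is why the abstract works with joint confidence regions rather than two separate intervals like the one sketched here.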