🤖 AI Summary
This study addresses the challenge of modeling ICU escalation risk from heterogeneous chest X-ray data in COVID-19 patients, a task poorly suited to conventional single-kernel methods. To this end, the authors propose GLIMARK (generalized linear models with integrated multiple additive regression with kernels), which extends multiple kernel learning (MKL) beyond its traditional focus on continuous outcomes to response variables from the exponential family. GLIMARK combines multiple base kernels into a composite kernel and uses an additive structure to capture nonlinear patterns while keeping the extracted features interpretable. Empirically, the framework recovers or approximates the true data-generating mechanism. Applied to a COVID-19 chest X-ray dataset, it predicts binary ICU escalation outcomes and extracts clinically meaningful imaging features.
📝 Abstract
Kernel methods are widely used in machine learning for classification and prediction because they can capture complex nonlinear patterns in data. However, single-kernel approaches are inherently limited: they rely on one type of kernel function (e.g., a Gaussian kernel), which may not fully represent the heterogeneous or multifaceted nature of real-world data. Multiple kernel learning (MKL) addresses this limitation by constructing composite kernels from simpler ones and integrating information from heterogeneous sources. Despite these advances, traditional MKL methods are designed primarily for continuous outcomes. We extend MKL to outcome variables from the exponential family, which covers a broader variety of data types, and refer to the proposed method as generalized linear models with integrated multiple additive regression with kernels (GLIMARK). Empirically, we demonstrate that GLIMARK can recover or approximate the true data-generating mechanism. We apply it to a COVID-19 chest X-ray dataset to predict binary ICU escalation outcomes and extract clinically meaningful features, underscoring the practical utility of the approach in real-world settings.
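To make the idea of a composite kernel with a non-continuous outcome concrete, the following is a minimal sketch, not the GLIMARK algorithm itself. It assumes a fixed, equally weighted sum of a Gaussian and a linear kernel, a Bernoulli outcome with a logit link, and a ridge-penalized fit by iteratively reweighted least squares. The function names (`gaussian_kernel`, `linear_kernel`, `composite_kernel`, `fit_kernel_logistic`), the kernel weights, the penalty value, and the synthetic data are all illustrative assumptions; none of them come from the paper.

```python
import numpy as np

def gaussian_kernel(X, Z, bandwidth=1.0):
    """Gaussian (RBF) kernel matrix between the rows of X and the rows of Z."""
    sq_dist = (
        np.sum(X**2, axis=1)[:, None]
        + np.sum(Z**2, axis=1)[None, :]
        - 2.0 * X @ Z.T
    )
    # Guard against tiny negative distances caused by rounding.
    return np.exp(-np.maximum(sq_dist, 0.0) / (2.0 * bandwidth**2))

def linear_kernel(X, Z):
    """Linear kernel matrix between the rows of X and the rows of Z."""
    return X @ Z.T

def composite_kernel(X, Z, weights=(0.5, 0.5)):
    """Weighted sum of base kernels.

    Assumption: the weights are fixed and non-negative here. How GLIMARK
    chooses or learns the combination is not described in the abstract.
    """
    return weights[0] * gaussian_kernel(X, Z) + weights[1] * linear_kernel(X, Z)

def fit_kernel_logistic(K, y, ridge=1.0, n_iter=25):
    """Ridge-penalized kernel logistic regression fitted by IRLS (Newton steps).

    Minimizes -loglik(K @ alpha) + (ridge / 2) * alpha' K alpha for a
    Bernoulli outcome with a logit link.
    """
    n = K.shape[0]
    alpha = np.zeros(n)
    for _ in range(n_iter):
        eta = np.clip(K @ alpha, -30.0, 30.0)   # linear predictor, clipped for safety
        mu = 1.0 / (1.0 + np.exp(-eta))         # mean via the inverse logit link
        w = mu * (1.0 - mu)                     # Bernoulli variance function
        # Newton update: (W K + ridge * I) alpha = W eta + (y - mu)
        A = w[:, None] * K + ridge * np.eye(n)
        alpha = np.linalg.solve(A, w * eta + (y - mu))
    return alpha

# Synthetic stand-in data with a nonlinear decision boundary (not real imaging features).
rng = np.random.default_rng(0)
X = rng.normal(size=(80, 2))
y = (X[:, 0] ** 2 + X[:, 1] > 1.0).astype(float)

K = composite_kernel(X, X)
alpha = fit_kernel_logistic(K, y)
prob = 1.0 / (1.0 + np.exp(-(K @ alpha)))
train_accuracy = np.mean((prob > 0.5) == (y == 1.0))
```

Under these assumptions, predictions for new points would use `composite_kernel(X_new, X) @ alpha` as the linear predictor. Swapping the logit link and the Bernoulli variance for another exponential-family pair (for example, a log link with Poisson variance) changes only the `mu` and `w` lines, which is the sense in which a generalized linear formulation widens the range of outcome types.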