A New Integrative Learning Framework for Integrating Multiple Secondary Outcomes into Primary Outcome Analysis: A Case Study on Liver Health

📅 2025-07-24
🤖 AI Summary
Existing methods for integrating multiple secondary outcomes (e.g., blood biochemistry, urine biomarkers) to improve inference on primary liver health outcomes often rely on strong modeling assumptions or prior knowledge to construct over-identified estimating functions, which limits their robustness and generalizability. This paper proposes a data-driven integrative learning framework that imposes no pre-specified functional form or stringent distributional assumptions. Leveraging statistical learning theory, it constructs an integrated estimating equation that targets both low variability and computational efficiency, enabling adaptive fusion and robust aggregation of heterogeneous secondary outcomes. Simulation studies show substantial variance reduction in primary outcome estimation. Applied to UK Biobank data, the method identifies a positive, previously unconfirmed association between cigarette smoking and fatty liver measures, with effects that are particularly pronounced among older adults. The work offers a general, assumption-light approach to integrative analysis of multi-source biomarkers.

📝 Abstract
In the era of big data, secondary outcomes have become increasingly important alongside primary outcomes. These secondary outcomes, which can be derived from traditional endpoints in clinical trials, compound measures, or risk prediction scores, hold the potential to enhance the analysis of primary outcomes. Our method is motivated by the challenge of utilizing multiple secondary outcomes, such as blood biochemistry markers and urine assays, to improve the analysis of the primary outcome related to liver health. Current integration methods often fall short, as they impose strong model assumptions or require prior knowledge to construct over-identified working functions. This paper addresses these statistical challenges and potentially opens a new avenue in data integration by introducing a novel integrative learning framework that is applicable in a general setting. The proposed framework allows for the robust, data-driven integration of information from multiple secondary outcomes, promotes the development of efficient learning algorithms, and ensures optimal use of available data. Extensive simulation studies demonstrate that the proposed method significantly reduces variance in primary outcome analysis, outperforming existing integration approaches. Additionally, applying this method to UK Biobank (UKB) reveals that cigarette smoking is associated with increased fatty liver measures, with these effects being particularly pronounced in the older adult cohort.
Problem

Research questions and friction points this paper is trying to address.

Integrating multiple secondary outcomes to enhance primary outcome analysis
Overcoming limitations of current methods with strong model assumptions
Developing a robust framework for data-driven integration in liver health
Innovation

Methods, ideas, or system contributions that make the work stand out.

Novel integrative learning framework for data integration
Robust data-driven integration of multiple secondary outcomes
Significant variance reduction in primary outcome analysis, demonstrated in simulation studies
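
The variance-reduction idea can be illustrated with a toy sketch. This is not the paper's framework: it is a textbook control-variate adjustment, shown only to convey why a mean-zero working function built from a secondary outcome can sharpen a primary estimate. Everything here is an assumption made for illustration: the function names `simulate_once` and `variance_ratio`, the bivariate normal data, and the premise that the secondary outcome `s` has a known mean of zero (the simplest possible over-identifying restriction).

```python
import math
import random
import statistics


def simulate_once(rng, n=200, rho=0.7, mu=1.0):
    """Return (naive, adjusted) estimates of the primary mean from one toy sample.

    Hypothetical setup: the secondary outcome s has known mean 0 and
    correlation rho with the primary outcome y, whose mean mu is the target.
    """
    s = [rng.gauss(0.0, 1.0) for _ in range(n)]
    y = [mu + rho * si + math.sqrt(1.0 - rho**2) * rng.gauss(0.0, 1.0) for si in s]

    y_bar, s_bar = statistics.fmean(y), statistics.fmean(s)
    cov_ys = sum((yi - y_bar) * (si - s_bar) for yi, si in zip(y, s)) / (n - 1)
    c = cov_ys / statistics.variance(s)  # estimated control-variate coefficient

    # s_bar should be near 0; its deviation from 0 is used to correct y_bar.
    return y_bar, y_bar - c * s_bar


def variance_ratio(n_rep=2000, seed=0, **kwargs):
    """Monte Carlo ratio: variance of the adjusted estimator over the naive one."""
    rng = random.Random(seed)
    pairs = [simulate_once(rng, **kwargs) for _ in range(n_rep)]
    naive_var = statistics.variance([p[0] for p in pairs])
    adjusted_var = statistics.variance([p[1] for p in pairs])
    return adjusted_var / naive_var


if __name__ == "__main__":
    print(f"adjusted / naive variance: {variance_ratio():.2f}")
```

In this toy setting the ratio should fall well below 1, and large-sample theory puts it near 1 - rho^2. The paper's contribution, as described above, is to obtain gains of this kind from multiple secondary outcomes without hand-specifying the working functions or relying on strong model assumptions.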
Daxuan Deng
Department of Public Health Sciences, Penn State College of Medicine, Hershey, PA, USA
Peisong Han
Associate Biostatistics Director, Gilead Sciences
missing data analysis, data integration, empirical likelihood, longitudinal data analysis, clinical trials
Shuo Chen
Department of Epidemiology and Public Health, University of Maryland School of Medicine, Baltimore, MD, USA
Ming Wang
Department of Population and Quantitative Health Sciences, Case Western Reserve University, Cleveland, OH, USA
Chixiang Chen
Associate Professor in Biostatistics, University of Maryland School of Medicine, Baltimore.
Statistics and Biostatistics