🤖 AI Summary
This study addresses the challenge of improving parameter estimation accuracy in a target population when individual-level data are unavailable and substantial heterogeneity exists across populations. The authors propose dShrink, a closed-form, tuning-free transfer estimation method that uses only summary statistics from the target and external populations. It requires no individual-level data or explicit modeling assumptions, and it remains usable when covariance matrices for the external summary statistics are unavailable. Its key innovations lie in a heterogeneity-adaptive shrinkage strategy and a multi-source information fusion mechanism. Theoretically, dShrink is guaranteed to achieve a lower expected quadratic error than the estimator based solely on target data, under arbitrary population heterogeneity. Extensive simulations and real-data analyses demonstrate that dShrink substantially enhances estimation accuracy when populations are similar or underlying parameters are near zero, while maintaining robustness under arbitrary heterogeneity structures.
📝 Abstract
Knowledge transfer across data sources holds great promise for improving the estimation of target population parameters by leveraging the growing availability of data from different sources. However, the effectiveness of knowledge transfer is often challenged by the complex and pervasive heterogeneity between data sources and the lack of access to individual-level data. This paper proposes the divide-and-shrink (dShrink) method, a transfer estimation method that estimates target population parameters in closed form using summary statistics from a target population and some external source populations while accounting for population heterogeneity. The dShrink estimator is guaranteed to outperform the estimator based solely on the target population in terms of expected quadratic error under arbitrary population heterogeneity. The gain can be substantial when the target and source populations are similar, or when the underlying true parameter values are near zero. Notably, dShrink is model-free, requires no user-specified tuning parameters, is robust to various types of heterogeneity between data sources, and applies to a broad range of parameter estimation problems. dShrink remains effective even when the covariance matrix is not accessible for the external summary statistics and offers flexibility in incorporating side information and summary statistics from multiple source populations. Simulations and real data analyses demonstrate the superior performance of the dShrink estimator and its potential as a robust tool for transfer estimation.