🤖 AI Summary
This work addresses the challenge of preserving user-level privacy in longitudinal data analysis, where traditional item-level (record-level) differential privacy fails to protect entire user trajectories and existing user-level differential privacy (user-level DP) methods lack rigorous inferential guarantees. The authors propose a unified user-level DP framework for longitudinal linear regression, which they describe as the first of its kind. The estimator aggregates local regression estimates, and inference relies on a privatized, bias-corrected covariance estimator that is automatically heteroskedasticity- and autocorrelation-consistent. Theoretical analysis establishes finite-sample guarantees and asymptotic normality of the estimator under short-range dependence. Empirical evaluations show promising performance for both estimation and inference under user-level privacy constraints.
📝 Abstract
Differential Privacy (DP) provides a rigorous framework for releasing statistics while protecting individual information present in a dataset. Although substantial progress has been made on differentially private linear regression, existing methods almost exclusively address the item-level DP setting, where each user contributes a single observation. Many scientific and economic applications instead involve longitudinal or panel data, in which each user contributes multiple dependent observations. In these settings, item-level DP offers inadequate protection, and user-level DP, which shields an individual's entire trajectory, is the appropriate privacy notion. We develop a comprehensive framework for estimation and inference in longitudinal linear regression under user-level DP. We propose a user-level private regression estimator based on aggregating local regressions, and we establish finite-sample guarantees and asymptotic normality under short-range dependence. For inference, we develop a privatized, bias-corrected covariance estimator that is automatically heteroskedasticity- and autocorrelation-consistent. These results provide the first unified framework for practical user-level DP estimation and inference in longitudinal linear regression under dependence, with strong theoretical guarantees and promising empirical performance.
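To make the idea of "aggregating local regressions" concrete, the following is a minimal sketch and not the authors' algorithm. It assumes one ordinary least-squares fit per user, L2 clipping of each user's coefficient vector, and the classical Gaussian mechanism applied to the average. The function names, the clipping radius, the replace-one-user neighboring relation, and the simulated AR(1) errors are all illustrative assumptions. The abstract does not specify the paper's actual aggregation rule, noise calibration, or covariance estimator, and none of those are reproduced here.

```python
import numpy as np

def local_ols(X, y):
    """Least-squares coefficients for one user's trajectory (X: T x d, y: T)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

def clip_l2(v, radius):
    """Scale v back onto the L2 ball of the given radius if it lies outside."""
    norm = np.linalg.norm(v)
    return v if norm <= radius else v * (radius / norm)

def user_level_dp_regression(users, epsilon, delta, clip_radius, rng):
    """Hypothetical clip-average-noise estimator; users is a list of (X_i, y_i)."""
    n = len(users)
    clipped = np.stack([clip_l2(local_ols(X, y), clip_radius) for X, y in users])
    mean_beta = clipped.mean(axis=0)
    # Assumption: neighboring datasets differ by replacing one user's entire
    # trajectory, so the clipped mean moves by at most 2 * clip_radius / n.
    sensitivity = 2.0 * clip_radius / n
    # Classical Gaussian mechanism calibration (stated for 0 < epsilon < 1).
    sigma = sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon
    return mean_beta + rng.normal(0.0, sigma, size=mean_beta.shape)

# Simulated panel: 2000 users, 20 observations each, AR(1) errors within user.
rng = np.random.default_rng(0)
true_beta = np.array([1.0, -0.5])
users = []
for _ in range(2000):
    X = rng.normal(size=(20, 2))
    shocks = rng.normal(scale=0.5, size=20)
    noise = np.zeros(20)
    for t in range(20):
        noise[t] = (0.5 * noise[t - 1] if t > 0 else 0.0) + shocks[t]
    users.append((X, X @ true_beta + noise))

estimate = user_level_dp_regression(
    users, epsilon=0.9, delta=1e-5, clip_radius=5.0, rng=rng
)
print(estimate)
```

Because each user contributes exactly one clipped vector to the average, the privacy guarantee in this sketch covers the whole trajectory rather than a single observation, which is the distinction between user-level and item-level DP drawn above. Valid confidence intervals would additionally require a private covariance estimate, which this sketch omits.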