Repeated sampling of different individuals but the same clusters to improve precision of difference-in-differences estimators: the DISC design

📅 2024-11-26

📈 Citations: 0

✨ Influential: 0

career value

184K/year

🤖 AI Summary

To address the infeasibility of cohort designs under high attrition rates and the low estimation precision of difference-in-differences (DID) with multi-period repeated cross-sectional (RCS) data, this paper proposes the DISC (Different Individuals, Same Clusters) sampling design: independent draws of distinct individuals from identical clusters across periods—thereby integrating advantages of both cohort and cross-sectional approaches. We theoretically establish that, under cluster-level random effects, DISC substantially improves DID estimation efficiency. Using a potential outcomes framework, variance decomposition, and intracluster correlation (ICC) analysis, we formally validate its statistical inference validity. Simulation results (n = 1,000) show that DISC reduces DID estimator variance by 56%, 74%, and 86% when ICC = 0.05, 0.1, and 0.2, respectively—yielding up to a 7.3-fold precision gain over conventional RCS. DISC thus provides an efficient, implementable causal inference paradigm for real-world settings where individual tracking is impractical, such as large-scale health surveys and policy evaluations.

Technology Category

Application Category

📝 Abstract

We describe the DISC (Different Individuals, Same Clusters) design, a sampling scheme that can improve the precision of difference-in-differences (DID) estimators in settings involving repeated sampling of a population at multiple time points. Although cohort designs typically lead to more efficient DID estimators relative to repeated cross-sectional (RCS) designs, they are often impractical in practice due to high rates of loss-to-follow-up, individuals leaving the risk set, or other reasons. The DISC design represents a hybrid between a cohort sampling design and a RCS sampling design, an alternative strategy in which the researcher takes a single sample of clusters, but then takes different cross-sectional samples of individuals within each cluster at two or more time points. We show that the DISC design can yield DID estimators with much higher precision relative to a RCS design, particularly if random cluster effects are present in the data-generating mechanism. For example, for a design in which 40 clusters and 25 individuals per cluster are sampled (for a total sample size of n=1,000), the variance of a commonly-used DID treatment effect estimator is 2.3 times higher in the RCS design for an intraclass correlation coefficient (ICC) of 0.05, 3.8 times higher for an ICC of 0.1, and 7.3 times higher for an ICC of 0.2.

Problem

Research questions and friction points this paper is trying to address.

Improving precision of difference-in-differences estimators

Addressing limitations of cohort and cross-sectional designs

Reducing variance through cluster-based sampling strategy

Innovation

Methods, ideas, or system contributions that make the work stand out.

Hybrid sampling design combining cohort and cross-sectional methods

Repeated sampling of different individuals within same clusters

Improves precision of difference-in-differences estimators significantly

🔎 Similar Papers

No similar papers found.

💼 Related Jobs

Research Intern

GE Healthcare

Bellevue, Washington, United States of America, 98004

Research Scientist Intern, Optimization, Privacy and Inference (PhD)