Decision Theory for the Archetype Discovery Problem

📅 2026-06-12
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study addresses how to summarize N heterogeneous treatment effects, which vary over a discrete set of covariates, into K < N representative archetypes. It combines decision theory with weighted K-means clustering: under a weighted mean-squared-error criterion, clustering the posterior means of the effects minimizes average risk for a given prior, while clustering a consistent estimator yields an approximately minimax procedure in large samples. The resulting procedure is analogous to Sorted GATES but replaces the K equally spaced quantile bins with weighted K-means archetype sets, and the clustering problem can be solved exactly with a well-known dynamic programming algorithm.
📝 Abstract
In the archetype discovery problem, a researcher wants to summarize N heterogeneous policy effects of interest that vary over a discrete set of covariates. The goal is to partition the set of covariates into K < N groups (the archetype sets) and to provide a summary of the policy effects for each group. We use decision theory to show that, under a weighted mean-squared-error criterion, a procedure analogous to the Sorted Group Average Treatment Effects (GATES) solves the archetype discovery problem. The key difference is that, in the optimal procedure, archetype sets are obtained by weighted K-means clustering of the N heterogeneous policy effects, instead of relying on K equally spaced quantiles. We show that the procedure that minimizes average risk for a given prior can be obtained by clustering the different values of the posterior mean estimate of the policy effects of interest. Similarly, an approximately minimax procedure in large samples can be obtained by clustering a consistent estimator of the policy effects. In both of these cases, an exact solution to the weighted K-means clustering problem can be found using a simple and well-known dynamic programming algorithm.
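The abstract does not spell out the dynamic programming algorithm, so the following is only a minimal sketch of the textbook O(K·N²) recursion for exact one-dimensional weighted K-means. It assumes that each policy effect is a scalar and that every weight is strictly positive. The function name `weighted_kmeans_1d` and its arguments are hypothetical, chosen for illustration: `values` stands in for the posterior means (or consistent estimates) of the N effects, and `weights` for the weights in the mean-squared-error criterion.

```python
from typing import List, Sequence, Tuple


def weighted_kmeans_1d(
    values: Sequence[float], weights: Sequence[float], k: int
) -> Tuple[List[int], List[float], float]:
    """Exact 1-D weighted K-means: minimize sum_i w_i * (x_i - mean of i's group)^2.

    Returns (labels in the original order, group means, minimal weighted cost).
    Assumes scalar values and strictly positive weights (illustrative sketch).
    """
    n = len(values)
    if len(weights) != n:
        raise ValueError("values and weights must have the same length")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= len(values)")
    if any(wt <= 0 for wt in weights):
        raise ValueError("weights must be strictly positive")

    # In one dimension an optimal partition is contiguous in sorted order.
    order = sorted(range(n), key=lambda i: values[i])
    x = [values[i] for i in order]
    w = [weights[i] for i in order]

    # Prefix sums of w, w*x, w*x^2 give each segment cost in O(1).
    W = [0.0] * (n + 1)
    S = [0.0] * (n + 1)
    Q = [0.0] * (n + 1)
    for i in range(n):
        W[i + 1] = W[i] + w[i]
        S[i + 1] = S[i] + w[i] * x[i]
        Q[i + 1] = Q[i] + w[i] * x[i] * x[i]

    def cost(a: int, b: int) -> float:
        """Weighted squared error of sorted points a..b-1 around their weighted mean."""
        tw, s = W[b] - W[a], S[b] - S[a]
        return max(Q[b] - Q[a] - s * s / tw, 0.0)  # clip tiny negative rounding error

    inf = float("inf")
    # best[g][j]: minimal cost of splitting the first j sorted points into g groups.
    best = [[inf] * (n + 1) for _ in range(k + 1)]
    cut = [[0] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0.0
    for g in range(1, k + 1):
        for j in range(g, n + 1):
            for a in range(g - 1, j):  # the last group covers sorted points a..j-1
                c = best[g - 1][a] + cost(a, j)
                if c < best[g][j]:
                    best[g][j], cut[g][j] = c, a

    # Backtrack through the stored cut points to recover groups and their means.
    labels = [0] * n
    means = [0.0] * k
    j = n
    for g in range(k, 0, -1):
        a = cut[g][j]
        means[g - 1] = (S[j] - S[a]) / (W[j] - W[a])
        for pos in range(a, j):
            labels[order[pos]] = g - 1
        j = a
    return labels, means, best[k][n]
```

Calling `weighted_kmeans_1d(estimates, weights, K)` would return one archetype label per covariate cell together with the weighted mean effect of each archetype set. By contrast, equally spaced quantile bins fix the group boundaries by rank, without regard to how the effect values are spread.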
Problem

Research questions and friction points this paper is trying to address.

archetype discovery
heterogeneous policy effects
covariate partitioning
group summarization
Innovation

Methods, ideas, or system contributions that make the work stand out.

decision theory
weighted K-means clustering
archetype discovery
posterior mean estimation
minimax procedure