🤖 AI Summary
This work proposes a statistic for quantifying heterogeneity in a population of probability measures, defined via a transform of the pairwise distance in Wasserstein space. The associated estimator is unbiased and, under mild moment conditions, strongly consistent and asymptotically normal, which supports characterizing within-group heterogeneity, comparing two groups, and identifying outlying samples. The theory further shows that the estimator remains stable under plug-in approximation when the individual measures must themselves be estimated, and that the associated empirical eccentricities pinpoint the observations driving heterogeneity, providing a reliable tool for comparative analysis and diagnostic assessment of data consisting of probability measures.
📝 Abstract
Data represented by probability measures arise as empirical distributions, posterior distributions, and feature-based representations of complex objects. We study heterogeneity in a population of probability measures through the expected value of a chosen transform of the pairwise Wasserstein distance. The resulting estimator is unbiased and, under simple moment conditions on the population law, is strongly consistent, asymptotically normal, and equipped with a consistent standard error. The framework also yields a simple comparison of two populations and remains stable under plug-in approximation when the measures themselves are estimated. The associated empirical eccentricities identify the observations that contribute most strongly to heterogeneity within a sample.
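As a rough illustration of the kind of statistic described above, the sketch below estimates the expected transformed pairwise Wasserstein distance for a sample of one-dimensional empirical measures and computes per-observation eccentricities. This is not the paper's implementation: the synthetic data, the choice of transform (squared distance), and the use of `scipy.stats.wasserstein_distance` for the 1-D case are all assumptions made for the example.

```python
import numpy as np
from scipy.stats import wasserstein_distance  # 1-D Wasserstein-1 distance

rng = np.random.default_rng(0)

# Hypothetical population: 20 observations, each a 1-D empirical measure
# given by 200 draws from a normal with a randomly shifted mean.
samples = [rng.normal(loc=rng.uniform(-1, 1), size=200) for _ in range(20)]

phi = np.square  # assumed transform of the pairwise distance
n = len(samples)

# Symmetric matrix of transformed pairwise distances (diagonal stays zero).
D = np.zeros((n, n))
for i in range(n):
    for j in range(i + 1, n):
        D[i, j] = D[j, i] = phi(wasserstein_distance(samples[i], samples[j]))

# Heterogeneity estimate: average of phi(W(mu_i, mu_j)) over distinct pairs,
# i.e. an unbiased U-statistic for the population expectation.
het = D[np.triu_indices(n, k=1)].mean()

# Empirical eccentricity of each observation: mean transformed distance
# to the other n - 1 measures; large values flag outlying samples.
ecc = D.sum(axis=1) / (n - 1)
print(het, int(ecc.argmax()))
```

Note that averaging the eccentricities recovers the heterogeneity estimate, since both average the same off-diagonal entries of `D`; here plug-in error enters only through using the 200-point empirical measures in place of the true ones.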