🤖 AI Summary
In model-based clustering, incomplete prior knowledge and difficulty in specifying appropriate uncertainty over the entire partition hinder robust inference. Method: We propose a locally weighted probabilistic modeling framework that allocates prior uncertainty at the individual observation level—departing from conventional global penalty schemes—and integrates spatiotemporal dependence via a spatiotemporal Gaussian process, coupled with a Bayesian nonparametric random partition model and MCMC inference. Contribution/Results: Our approach enables fine-grained, subset-specific uncertainty encoding, substantially enhancing flexibility and robustness in incorporating expert knowledge. Evaluated on PM₁₀ spatiotemporal data and synthetic experiments, it achieves 12–19% higher clustering accuracy than baseline methods and demonstrates superior robustness to misspecified priors.
📝 Abstract
Model-based clustering is a powerful tool that is often used to discover hidden structure in data by grouping observational units that exhibit similar response values. Recently, clustering methods have been developed that permit incorporating an ``initial'' partition informed by expert opinion. Then, using some similarity criteria, partitions different from the initial one are down weighted, i.e. they are assigned reduced probabilities. These methods represent an exciting new direction of method development in clustering techniques. We add to this literature a method that very flexibly permits assigning varying levels of uncertainty to any subset of the partition. This is particularly useful in practice as there is rarely clear prior information with regards to the entire partition. Our approach is not based on partition penalties but considers individual allocation probabilities for each unit (e.g., locally weighted prior information). We illustrate the gains in prior specification flexibility via simulation studies and an application to a dataset concerning spatio-temporal evolution of ${
m PM}_{10}$ measurements in Germany.