Bayesian Clustering Factor Models

📅 2025-05-08
🤖 AI Summary
High-dimensional, heterogeneous healthcare data, such as data from opioid use disorder (OUD) recovery studies, make it difficult to perform dimension reduction and clustering jointly. Method: We propose a class of Bayesian clustering factor models: a factor model in which each subject's vector of common factors follows a mixture of Gaussian distributions, so that dimension reduction and subject-level clustering are carried out simultaneously. A Gibbs sampler explores the posterior distribution, and an information criterion selects the number of clusters and the number of factors. Contribution/Results: Simulation studies indicate that the inferential approach appropriately quantifies uncertainty, and that the information criterion compares favorably with a previously published competitor method in selecting the correct number of clusters and factors. An application to OUD recovery data illustrates the framework; in that setting, clustering of individuals may facilitate personalized health care.

📝 Abstract
We present a novel framework for concomitant dimension reduction and clustering. This framework is based on a novel class of Bayesian clustering factor models. These models assume a factor model structure where the vectors of common factors follow a mixture of Gaussian distributions. We develop a Gibbs sampler to explore the posterior distribution and propose an information criterion to select the number of clusters and the number of factors. Simulation studies show that our inferential approach appropriately quantifies uncertainty. In addition, when compared to a previously published competitor method, our information criterion has favorable performance in terms of correct selection of number of clusters and number of factors. Finally, we illustrate the capabilities of our framework with an application to data on recovery from opioid use disorder where clustering of individuals may facilitate personalized health care.
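The model structure described in the abstract can be illustrated by simulating data from it. The following is a minimal sketch, not the authors' implementation: it assumes identity covariance within each factor cluster, independent observation noise, and equal cluster weights, and every name and numeric value (`simulate_clustering_factor_model`, `n_factors`, `noise_sd`, the seed, and so on) is hypothetical.

```python
import numpy as np


def simulate_clustering_factor_model(n=200, p=10, n_factors=2, n_clusters=3, seed=0):
    """Simulate data from a factor model whose factors follow a Gaussian mixture.

    All settings here are illustrative assumptions, not values from the paper.
    """
    rng = np.random.default_rng(seed)

    # Assumed mixture parameters: equal weights, well-spread cluster means.
    weights = np.full(n_clusters, 1.0 / n_clusters)
    means = rng.normal(scale=3.0, size=(n_clusters, n_factors))

    # Assumed factor loadings (p x n_factors) and per-variable noise scale.
    loadings = rng.normal(size=(p, n_factors))
    noise_sd = np.full(p, 0.5)

    # Step 1: draw a cluster label for each subject.
    labels = rng.choice(n_clusters, size=n, p=weights)

    # Step 2: draw each subject's common factors around its cluster mean
    # (identity covariance within clusters is an assumption of this sketch).
    factors = means[labels] + rng.normal(size=(n, n_factors))

    # Step 3: map the low-dimensional factors to p observed variables and add noise.
    y = factors @ loadings.T + rng.normal(size=(n, p)) * noise_sd
    return y, factors, labels


y, factors, labels = simulate_clustering_factor_model()
print(y.shape, factors.shape, np.bincount(labels, minlength=3))
```

In this sketch the clustering lives entirely in the low-dimensional factor space, which is why a single model can reduce dimension and group subjects at the same time.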
Problem

Research questions and friction points this paper is trying to address.

Develop Bayesian clustering factor models for dimension reduction and clustering
Propose Gibbs sampler and information criterion for cluster and factor selection
Apply framework to opioid recovery data for personalized healthcare clustering
Innovation

Methods, ideas, or system contributions that make the work stand out.

Bayesian clustering factor models framework
Gibbs sampler for posterior distribution
Information criterion for selecting the number of clusters and the number of factors
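To give a feel for what one step of a Gibbs sampler looks like in a Gaussian mixture over factors, the sketch below draws cluster labels from their full conditional distribution given the current factors, cluster means, and weights. This is the standard textbook label update under an assumed identity covariance, not the sampler derived in the paper; the function name `sample_labels` and all inputs are hypothetical.

```python
import numpy as np


def sample_labels(factors, means, weights, rng):
    """Draw cluster labels given factors (generic mixture update, identity covariance assumed)."""
    n_clusters = means.shape[0]

    # Squared distance from every subject's factor vector to every cluster mean: shape (n, G).
    sq_dist = ((factors[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)

    # Unnormalised log posterior: log weight plus Gaussian log density (constants dropped).
    log_post = np.log(weights)[None, :] - 0.5 * sq_dist

    # Normalise in a numerically stable way.
    log_post -= log_post.max(axis=1, keepdims=True)
    probs = np.exp(log_post)
    probs /= probs.sum(axis=1, keepdims=True)

    # Inverse-CDF sampling, one uniform draw per subject.
    u = rng.random(size=(factors.shape[0], 1))
    labels = (probs.cumsum(axis=1) < u).sum(axis=1)
    labels = np.minimum(labels, n_clusters - 1)  # guard against rounding in the cumulative sum
    return labels, probs


rng = np.random.default_rng(0)
means = np.array([[-10.0, -10.0], [10.0, 10.0]])
weights = np.array([0.5, 0.5])
factors = means[[0, 1, 1, 0]]  # four subjects placed exactly at their cluster means
labels, probs = sample_labels(factors, means, weights, rng)
print(labels)
```

A full sampler would alternate a step like this with updates of the factors, loadings, cluster parameters, and noise variances; those updates depend on prior choices that are not stated here, so they are left out.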
Authors

H. Shin
Henry Ford Health, Detroit, MI, 48202, USA
Marco A. R. Ferreira
Professor of Statistics, Virginia Tech
Bayesian methods, time series analysis, spatial data, spatio-temporal modeling
A. Tegge
Fralin Biomedical Research Institute, Virginia Tech, Roanoke, Virginia, 24016, USA