AI Summary
This study addresses the need to identify disease subtypes accurately in clinical decision-making by proposing a Bayesian nonparametric clustering method based on the Dirichlet process mixture model. The method uses coordinate ascent variational inference to stratify patients efficiently, reducing computational cost relative to Markov chain Monte Carlo (MCMC) while preserving clustering accuracy, so that the speed-up does not come with an increased risk of misdiagnosis. In synthetic experiments, the model accurately recovers ground-truth cluster structure and performs strongly on homogeneity and completeness metrics, at substantially lower computational cost than MCMC.
Abstract
Medical decision-making increasingly requires rapid and reliable assignment of patients to disease subtypes, as many diseases are no longer treated as single entities. For example, cancer patients may be stratified into aggressive and non-aggressive subtypes, with different treatment strategies for each group. We propose a Bayesian nonparametric approach based on a Dirichlet process mixture model for clustering individuals into disease subtypes. To support medical decision-making, we implement a coordinate ascent variational inference algorithm, which yields an effective and computationally efficient alternative to Markov chain Monte Carlo (MCMC). In synthetic experiments, we demonstrate that the proposed approach accurately assigns observations to their ground-truth clusters, achieving strong performance on evaluation metrics such as homogeneity and completeness. We also show that the proposed approach substantially reduces computational cost compared to MCMC without a loss of accuracy that would increase the risk of misdiagnosis.
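The abstract does not say how homogeneity and completeness are computed, so the following is only a minimal sketch that assumes the standard entropy-based definitions: homogeneity is 1 - H(C|K)/H(C) and completeness is 1 - H(K|C)/H(K), where C denotes the ground-truth subtypes and K the predicted clusters. The function names and toy labels are illustrative and are not taken from the paper.

```python
from collections import Counter
from math import log


def entropy(labels):
    """Shannon entropy (in nats) of a label sequence."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def conditional_entropy(target, given):
    """H(target | given) in nats, estimated from paired label sequences."""
    n = len(target)
    joint = Counter(zip(given, target))
    marginal = Counter(given)
    return -sum((c / n) * log(c / marginal[g]) for (g, _), c in joint.items())


def homogeneity_completeness(truth, predicted):
    """Return (homogeneity, completeness) under the assumed definitions."""
    h_truth, h_pred = entropy(truth), entropy(predicted)
    # By convention, a score is 1 when the corresponding entropy is zero.
    homogeneity = (
        1.0 if h_truth == 0
        else 1.0 - conditional_entropy(truth, predicted) / h_truth
    )
    completeness = (
        1.0 if h_pred == 0
        else 1.0 - conditional_entropy(predicted, truth) / h_pred
    )
    return homogeneity, completeness


# Hypothetical toy labels: two true subtypes, three patients each.
truth = [0, 0, 0, 1, 1, 1]
over_split = [0, 0, 1, 2, 2, 2]  # subtype 0 split across two clusters
merged = [0, 0, 0, 0, 0, 0]      # both subtypes merged into one cluster

print(homogeneity_completeness(truth, over_split))
print(homogeneity_completeness(truth, merged))
```

Under these definitions, over-splitting a subtype leaves homogeneity at 1 but lowers completeness, while merging subtypes does the opposite. Reporting both scores therefore shows whether a clustering method tends to fragment or to conflate disease subtypes.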