A Machine Learning-Based Framework for Discovering Huntington's Disease Stages: Integrating Graph Representation Learning and Clustering to Uncover Progression Dynamics in Longitudinal Enroll-HD Dataset

📅 2026-06-04

🤖 AI Summary
Traditional clinical staging of Huntington's disease relies on predefined thresholds and expert judgment, so it often fails to capture intra-stage heterogeneity and is susceptible to inter-rater variability. This work proposes an unsupervised machine learning framework that combines dynamic graph representation learning with cluster stability analysis to model temporal dependencies within and across individuals, using 1,477 longitudinal visits from 302 Enroll-HD participants. K-means++ clustering on the learned representations identifies disease progression stages in a data-driven manner, yielding four statistically distinct stages with well-defined clinical measurement boundaries and minimal overlap. This data-driven scheme addresses limitations of conventional threshold-based staging and offers a more nuanced view of disease evolution.
📝 Abstract
Huntington's disease (HD) is a progressive brain disorder that gradually affects movement, cognitive function, and behavior. Identifying the stage of the disease accurately and consistently is important for understanding its course, grouping patients, personalizing care, and discovering treatments. Existing clinical staging frameworks rely primarily on predefined clinical measurement thresholds and clinical expert decisions, yet these discrete cut-offs may obscure meaningful intra-stage variability and remain vulnerable to inter-rater differences, especially in motor and functional assessments. To address these limitations, we developed an unsupervised machine learning framework based on dynamic graph representation learning to capture temporal relationships within and across patients from longitudinal clinical measurements. Using the learned representations, we applied K-means++ clustering to identify well-separated groups. We then iteratively increased the number of clusters (k), using stability analysis to assess robustness and reveal additional meaningful clusters beyond the initial optimal solution. We applied the framework to 302 individuals from the Enroll-HD cohort (1,477 visits, 44 clinical variables per visit; 80% manifest participants), enabling data-driven discovery of HD stages reflecting natural clinical progression. Despite the limited cohort size, the proposed framework achieved robust clustering performance using a four-dimensional latent space, identifying four meaningful and statistically distinct disease stages through clustering stability analysis. Each stage corresponded to well-defined clinical measurement boundaries, with minimal overlap compared to previously established clinical staging methods.
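The clustering step described in the abstract (K-means++ on learned representations, then increasing k while checking stability) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not the authors' implementation: the synthetic four-dimensional blobs stand in for the learned embeddings, the subsample-and-compare procedure with the adjusted Rand index is one common way to measure clustering stability, and the helper names (`assign`, `kmeans_pp`, `adjusted_rand`, `stability`) are hypothetical.

```python
import math
import random
from collections import Counter


def assign(points, centres):
    """Label each point with the index of its nearest centre."""
    return [min(range(len(centres)), key=lambda j: math.dist(p, centres[j]))
            for p in points]


def kmeans_pp(points, k, rng, n_init=5, iters=50):
    """K-means with k-means++ seeding; keeps the lowest-inertia restart."""
    best_centres, best_inertia = None, math.inf
    for _ in range(n_init):
        # k-means++ seeding: draw each new centre with probability
        # proportional to squared distance from the nearest chosen centre.
        centres = [rng.choice(points)]
        while len(centres) < k:
            d2 = [min(math.dist(p, c) ** 2 for c in centres) for p in points]
            centres.append(rng.choices(points, weights=d2, k=1)[0])
        # Lloyd iterations: reassign points, then recompute centroids.
        for _ in range(iters):
            labels = assign(points, centres)
            updated = []
            for j in range(k):
                members = [p for p, lab in zip(points, labels) if lab == j]
                updated.append(tuple(sum(c) / len(members) for c in zip(*members))
                               if members else centres[j])
            if updated == centres:
                break
            centres = updated
        inertia = sum(min(math.dist(p, c) ** 2 for c in centres) for p in points)
        if inertia < best_inertia:
            best_centres, best_inertia = centres, inertia
    return best_centres


def adjusted_rand(a, b):
    """Adjusted Rand index between two labelings of the same points."""
    def pairs(n):
        return n * (n - 1) / 2
    together = sum(pairs(c) for c in Counter(zip(a, b)).values())
    sum_a = sum(pairs(c) for c in Counter(a).values())
    sum_b = sum(pairs(c) for c in Counter(b).values())
    expected = sum_a * sum_b / pairs(len(a))
    maximum = (sum_a + sum_b) / 2
    return 1.0 if maximum == expected else (together - expected) / (maximum - expected)


def stability(points, k, rng, runs=10, frac=0.8):
    """Mean agreement between clusterings fitted on two random subsamples."""
    m = int(frac * len(points))
    scores = []
    for _ in range(runs):
        centres_a = kmeans_pp(rng.sample(points, m), k, rng)
        centres_b = kmeans_pp(rng.sample(points, m), k, rng)
        # Compare both solutions on the full data set.
        scores.append(adjusted_rand(assign(points, centres_a),
                                    assign(points, centres_b)))
    return sum(scores) / len(scores)


# Synthetic stand-in for the learned embeddings: four Gaussian blobs in 4-D.
rng = random.Random(0)
means = [(0, 0, 0, 0), (10, 0, 0, 0), (0, 10, 0, 0), (0, 0, 10, 0)]
points = [tuple(rng.gauss(mu, 0.5) for mu in mean)
          for mean in means for _ in range(40)]

# Increase k step by step and record how stable each solution is.
scores = {k: stability(points, k, rng) for k in range(2, 7)}
for k, score in scores.items():
    print(f"k={k}: mean adjusted Rand index = {score:.3f}")
```

On toy data with four well-separated groups, the k = 4 solution would be expected to score close to 1, while values of k that force blobs to merge or split tend to be less reproducible across subsamples. Real embeddings are rarely this clean, so the stability curve is best read alongside clinical interpretation of the resulting clusters.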
Problem

Research questions and friction points this paper is trying to address.

Huntington's disease
disease staging
clinical heterogeneity
inter-rater variability
longitudinal data
Innovation

Methods, ideas, or system contributions that make the work stand out.

graph representation learning
unsupervised clustering
longitudinal data analysis
Huntington's disease staging
clustering stability
Lubna M. Abu Zohair
School of Mathematical and Computer Sciences, Heriot-Watt University, Dubai, United Arab Emirates
Marta Vallejo
School of Mathematical and Computer Sciences, Heriot-Watt University, Edinburgh, United Kingdom
MD Azher Uddin
School of Mathematical and Computer Sciences, Heriot-Watt University, Dubai, United Arab Emirates
John R. Woodward
School of Mathematical and Computer Sciences, Heriot-Watt University, Dubai, United Arab Emirates
Hind Zantout
Heriot-Watt University Dubai
Education, Machine Learning in Healthcare, Cybersecurity, Semantic Technologies, Gender Studies