MLASDO: a software tool to detect and explain clinical and omics inconsistencies applied to the Parkinson's Progression Markers Initiative cohort

📅 2025-07-04
🤖 AI Summary
Clinical phenotypes and omics profiles are often discordant in medical cohorts, which can hinder early diagnosis and molecular subtyping. To address this, we propose MLASDO, a three-step machine learning workflow: (1) classifying healthy individuals and cases from omics data with a support vector machine; (2) detecting anomalous samples within each group; and (3) explaining the anomalous individuals using clinical data and expert knowledge. The method is implemented as an open-source R package. Applied to transcriptomic data from 317 healthy controls and 465 Parkinson's disease patients in the Parkinson's Progression Markers Initiative (PPMI) cohort, MLASDO detected 15 healthy controls with a Parkinson's-like transcriptomic signature and 22 Parkinson's disease patients with a transcriptomic signature closer to that of healthy controls. The anomalous controls showed lower proportions of CD4/CD8 naive T-cells and CD4 memory T-cells compared with healthy controls (P < 3.5 × 10^-3), and the anomalous patients showed a lower proportion of mature neutrophils compared with Parkinson's disease patients (P < 6 × 10^-3). By flagging individuals whose molecular profile disagrees with their clinical label, the tool can help clinicians identify cases worth following up, such as a possible early onset that is not yet visible in observable symptoms.

📝 Abstract
Inconsistencies between clinical and omics data may arise within medical cohorts. Identifying, annotating and explaining patients or individuals whose omics profile is anomalous may be crucial to refine how the disease is characterized, e.g., by detecting an early onset that is signaled by the omics data but not yet detectable from observable symptoms. Here, we developed MLASDO (Machine Learning based Anomalous Sample Detection on Omics), a new method and software tool to identify, characterize and automatically describe anomalous samples based on omics data. Its workflow consists of three steps: (1) classification of healthy individuals and cases using a support vector machine algorithm; (2) detection of anomalous samples within groups; (3) explanation of anomalous individuals based on clinical data and expert knowledge. We showcase MLASDO using transcriptomics data of 317 healthy controls (HC) and 465 Parkinson's disease (PD) cases from the Parkinson's Progression Markers Initiative. In this cohort, MLASDO detected 15 anomalous HC with a PD-like transcriptomic signature and PD-like clinical features, including a lower proportion of CD4/CD8 naive T-cells and CD4 memory T-cells compared to HC (P < 3.5 × 10^-3). MLASDO also identified 22 anomalous PD cases with a transcriptomic signature more similar to that of HC and some clinical features more similar to HC, including a lower proportion of mature neutrophils compared to PD cases (P < 6 × 10^-3). In summary, MLASDO is a tool that can help clinicians detect and explain anomalous HC and cases of interest to be followed up. MLASDO is an open-source R package available at: https://github.com/JoseAdrian3/MLASDO.
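The first two workflow steps (classify with a support vector machine, then look for samples that sit with the wrong group) can be illustrated with a small, hedged sketch. This is not MLASDO's code or API: the package is written in R, and its exact anomaly criterion is not given in this abstract. The sketch assumes a linear SVM trained by full-batch subgradient descent, a rule that flags a sample when its predicted class disagrees with its clinical label, and an invented two-feature dataset. The names `SAMPLES`, `train_linear_svm` and `flag_anomalies` are hypothetical.

```python
# Illustrative sketch only: NOT the MLASDO implementation.
# Toy data, invented for this example. Labels: -1 = HC, +1 = PD.
# HC5 and PD5 are deliberately placed inside the opposite cluster.
SAMPLES = [
    ("HC1", (-2.0, -1.5), -1), ("HC2", (-1.5, -2.0), -1),
    ("HC3", (-2.5, -2.0), -1), ("HC4", (-2.0, -2.5), -1),
    ("HC5", (2.0, 1.8), -1),   # HC with a PD-like profile
    ("PD1", (2.0, 1.5), 1), ("PD2", (1.5, 2.0), 1),
    ("PD3", (2.5, 2.0), 1), ("PD4", (2.0, 2.5), 1),
    ("PD5", (-1.8, -2.0), 1),  # PD with an HC-like profile
]


def decision_value(w, b, x):
    """Signed score w.x + b; its sign is the predicted class."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def train_linear_svm(X, y, lam=0.01, lr=0.01, epochs=2000):
    """Full-batch subgradient descent on the L2-regularised hinge loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]
        grad_b = 0.0
        for xi, yi in zip(X, y):
            if yi * decision_value(w, b, xi) < 1:  # margin violated: hinge term active
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def flag_anomalies(samples):
    """Return IDs whose SVM-predicted class disagrees with the clinical label."""
    X = [features for _, features, _ in samples]
    y = [label for _, _, label in samples]
    w, b = train_linear_svm(X, y)
    flagged = []
    for sample_id, features, label in samples:
        predicted = 1 if decision_value(w, b, features) >= 0 else -1
        if predicted != label:
            flagged.append(sample_id)
    return flagged


print(flag_anomalies(SAMPLES))
```

On real data, predictions would more reasonably come from held-out folds (cross-validation) than from the training set itself, and a threshold on the decision value could replace the simple label-disagreement rule; both choices are left out here to keep the sketch short.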
Problem

Research questions and friction points this paper is trying to address.

Detects inconsistencies between clinical and omics data in medical cohorts
Identifies and explains anomalous samples using omics data
Helps clinicians detect and follow up anomalous cases in diseases like Parkinson's
Innovation

Methods, ideas, or system contributions that make the work stand out.

SVM-based classification of healthy controls and PD cases
Detection of anomalous samples within groups
Explanation using clinical data and expert knowledge
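The explanation step compares clinical variables between anomalous samples and their reference group, as in the cell-proportion differences reported in the abstract. The statistical test used by the authors is not stated on this page, so the sketch below swaps in an exact two-sided permutation test on the difference in means as a stand-in. All values are invented for illustration, and the names `exact_permutation_pvalue`, `anomalous_hc` and `reference_hc` are hypothetical.

```python
# Illustrative sketch only: the test and the data are assumptions, not taken from the paper.
from itertools import combinations
from statistics import mean


def exact_permutation_pvalue(group_a, group_b):
    """Exact two-sided permutation p-value for the difference in group means.

    Enumerates every way of splitting the pooled values into groups of the
    original sizes, so it is only practical for small groups.
    """
    pooled = list(group_a) + list(group_b)
    size_a = len(group_a)
    observed = abs(mean(group_a) - mean(group_b))
    extreme = 0
    total = 0
    for chosen in combinations(range(len(pooled)), size_a):
        chosen_set = set(chosen)
        a = [pooled[i] for i in chosen]
        b = [pooled[i] for i in range(len(pooled)) if i not in chosen_set]
        total += 1
        # Small tolerance so the observed split always counts despite float rounding.
        if abs(mean(a) - mean(b)) >= observed - 1e-12:
            extreme += 1
    return extreme / total


# Invented proportions of one cell type (e.g. naive T-cells) per individual.
anomalous_hc = [0.10, 0.12, 0.11, 0.09]
reference_hc = [0.20, 0.22, 0.19, 0.21, 0.23, 0.18]

p_value = exact_permutation_pvalue(anomalous_hc, reference_hc)
print(f"difference in means: {mean(anomalous_hc) - mean(reference_hc):.3f}")
print(f"permutation p-value: {p_value:.5f}")
```

With groups as large as those in the PPMI cohort, full enumeration is infeasible; a Monte Carlo permutation test or a standard rank-based test would be the practical alternative.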
José A. Pardo
Departamento de Ingeniería de la Información y las Comunicaciones, Universidad de Murcia, Facultad de Informática, Campus Espinardo, Spain
Tomás Bernal
Departamento de Ingeniería de la Información y las Comunicaciones, Universidad de Murcia, Facultad de Informática, Campus Espinardo, Spain
Jaime Ñiguez
Departamento de Ingeniería de la Información y las Comunicaciones, Universidad de Murcia, Facultad de Informática, Campus Espinardo, Spain
Ana Luisa Gil-Martínez
Department of Clinical and Movement Neurosciences, UCL Queen Square Institute of Neurology, London, UK; Movement Disorders Centre, UCL Queen Square Institute of Neurology, London, UK
Laura Ibañez
Department of Neurology, Washington University School of Medicine, Saint Louis, MO, 63110, USA
José T. Palma
Departamento de Ingeniería de la Información y las Comunicaciones, Universidad de Murcia, Facultad de Informática, Campus Espinardo, Spain
Juan A. Botía
Professor, Universidad de Murcia
Computer Science and Artificial Intelligence
Alicia Gómez-Pascual
Departamento de Ingeniería de la Información y las Comunicaciones, Universidad de Murcia, Facultad de Informática, Campus Espinardo, Spain