AI Summary
Accurate molecular subtyping of spitzoid tumors (STs) is critical for prognosis assessment and therapeutic decision-making; however, clinical DNA methylation data frequently suffer from high-dimensional missingness due to insufficient sequencing coverage and technical artifacts, severely limiting the performance of existing classification models. To address this, we propose ReMAC, the first masked autoencoder framework specifically designed for tumor methylation analysis. ReMAC unifies representation learning, missing-value imputation, and end-to-end classification within a single architecture, enabling robust modeling of both complete and incomplete methylation profiles. Notably, it introduces masked autoencoding, a paradigm previously unexplored in epigenetic classification, to jointly reconstruct masked features and optimize discriminative subtyping. Evaluated on real-world clinical ST datasets, ReMAC achieves significant improvements in subtype classification accuracy and robustness over state-of-the-art methods. The implementation is publicly available.
Abstract
Accurate diagnosis of spitzoid tumors (ST) is critical to ensure a favorable prognosis and to avoid both under- and over-treatment. Epigenetic data, particularly DNA methylation, provide a valuable source of information for this task. However, prior studies assume complete data, an unrealistic setting as methylation profiles frequently contain missing entries due to limited coverage and experimental artifacts. Our work challenges these favorable scenarios and introduces ReMAC, an extension of ReMasker designed to tackle classification tasks on high-dimensional data under complete and incomplete regimes. Evaluation on real clinical data demonstrates that ReMAC achieves strong and robust performance compared to competing classification methods in the stratification of ST. Code is available at https://github.com/roshni-mahtani/ReMAC.
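
To make the idea of jointly reconstructing masked features and classifying more concrete, the following is a minimal NumPy sketch of one way such a combined objective could be computed for a single methylation profile. It is not taken from the ReMAC repository. The function name `joint_masked_loss`, the weight `lam`, the additive form of the loss, and all toy values are assumptions made for illustration. The sketch assumes a ReMasker-style setup in which some observed entries are additionally hidden ("re-masked") and reconstruction error is measured only where a ground-truth value exists.

```python
import numpy as np


def joint_masked_loss(x, x_hat, observed, remasked, logits, label, lam=1.0):
    """Hypothetical joint objective: masked reconstruction error + cross-entropy.

    x        : feature vector, NaN where the value is genuinely missing
    x_hat    : model reconstruction of every feature
    observed : boolean mask, True where x holds a measured value
    remasked : boolean mask, True where an entry was hidden from the model
    logits   : unnormalised class scores for this sample
    label    : integer index of the true class
    lam      : assumed weight balancing the two terms
    """
    # Only entries that were measured AND hidden have a known target,
    # so genuinely missing (NaN) entries never enter the error.
    target = observed & remasked
    if target.any():
        recon = float(np.mean((x_hat[target] - x[target]) ** 2))
    else:
        recon = 0.0

    # Numerically stable log-softmax followed by cross-entropy.
    shifted = logits - logits.max()
    log_probs = shifted - np.log(np.exp(shifted).sum())
    ce = float(-log_probs[label])

    return recon + lam * ce, recon, ce


# Toy profile with 8 CpG sites; two entries are genuinely missing.
rng = np.random.default_rng(0)
observed = np.array([1, 1, 0, 1, 1, 1, 0, 1], dtype=bool)
x = np.where(observed, rng.uniform(0.0, 1.0, size=8), np.nan)

# Two observed entries are hidden from the model during training.
remasked = np.array([0, 1, 0, 0, 1, 0, 0, 0], dtype=bool)

# Placeholder model outputs; a real model would produce these.
x_hat = np.full(8, 0.5)
logits = np.array([2.0, 0.5, -1.0])

total, recon, ce = joint_masked_loss(x, x_hat, observed, remasked, logits, label=0)
print(f"reconstruction={recon:.4f} cross_entropy={ce:.4f} total={total:.4f}")
```

In this sketch the genuinely missing entries contribute nothing to the reconstruction term, which is what lets the same objective be applied to both complete and incomplete profiles.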