Explaining AutoClustering: Uncovering Meta-Feature Contribution in AutoML for Clustering

📅 2026-02-20
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study addresses the limited reliability and diagnostic capability of existing AutoClustering systems, which stem from a lack of interpretability regarding how meta-features influence the selection of clustering algorithms and hyperparameters. For the first time, the work systematically reviews the meta-features employed across 22 AutoClustering methods and organizes them into a coherent taxonomy. By integrating global interpretability through Decision Predicate Graphs and local interpretability via SHAP values, the authors conduct a thorough analysis of feature contributions within meta-models. Their investigation uncovers structural biases and consistent patterns in current meta-learning strategies, revealing fundamental limitations of prevailing approaches. These insights not only expose critical shortcomings but also offer actionable interpretability guidelines for designing transparent and trustworthy unsupervised AutoML systems.
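To make the summary's central object concrete, below is a minimal sketch of statistical meta-feature extraction, in the spirit of the descriptors such meta-models consume; the specific features are illustrative assumptions, not the taxonomy derived in the paper.

```python
# Hypothetical meta-feature extractor: a few statistical dataset
# descriptors of the kind AutoClustering meta-models take as input.
# The exact set here is an assumption, not the paper's taxonomy.
import numpy as np
from scipy import stats

def extract_meta_features(X: np.ndarray) -> dict:
    """Simple statistical meta-features for a dataset X of shape (n, d), d >= 2."""
    n, d = X.shape
    corr = np.corrcoef(X, rowvar=False)       # d x d feature-correlation matrix
    off_diag = corr[~np.eye(d, dtype=bool)]   # keep only off-diagonal entries
    return {
        "n_instances": n,
        "n_features": d,
        "dimensionality_ratio": d / n,
        "mean_abs_correlation": float(np.mean(np.abs(off_diag))),
        "mean_skewness": float(np.mean(stats.skew(X, axis=0))),
        "mean_kurtosis": float(np.mean(stats.kurtosis(X, axis=0))),
    }

# Usage: one such vector per dataset forms the meta-model's training input.
print(extract_meta_features(np.random.default_rng(0).normal(size=(200, 5))))
```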

📝 Abstract
AutoClustering methods aim to automate unsupervised learning tasks, including algorithm selection (AS), hyperparameter optimization (HPO), and pipeline synthesis (PS), often by leveraging meta-learning over dataset meta-features. While these systems frequently achieve strong performance, their recommendations are difficult to justify: the influence of dataset meta-features on algorithm and hyperparameter choices is typically not exposed, which limits reliability, bias diagnostics, and efficient meta-feature engineering. In this work, we investigate the explainability of meta-models in AutoClustering. We first review 22 existing methods and organize their meta-features into a structured taxonomy. We then apply a global explainability technique, Decision Predicate Graphs, to assess feature importance within the meta-models of selected frameworks. Finally, we use local explainability tools such as SHAP (SHapley Additive exPlanations) to analyze specific clustering decisions. Our findings highlight consistent patterns in meta-feature relevance, identify structural weaknesses in current meta-learning strategies that can distort recommendations, and provide actionable guidance for more interpretable Automated Machine Learning (AutoML) design. This study therefore offers a practical foundation for increasing decision transparency in unsupervised learning automation.
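As a rough, non-authoritative illustration of the local analysis described above, the sketch below trains a stand-in meta-model (a RandomForestClassifier on synthetic meta-data, not any reviewed framework's actual model) and queries SHAP for the meta-feature attributions behind a single algorithm recommendation.

```python
# Hedged sketch of local explainability with SHAP. The meta-dataset and
# meta-model are synthetic stand-ins; only the SHAP workflow mirrors
# the kind of analysis described in the abstract.
import numpy as np
import shap
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
meta_X = rng.normal(size=(120, 6))                          # 120 datasets x 6 meta-features
meta_y = rng.choice(["kmeans", "dbscan", "spectral"], 120)  # best algorithm per dataset

meta_model = RandomForestClassifier(n_estimators=200, random_state=0)
meta_model.fit(meta_X, meta_y)

explainer = shap.TreeExplainer(meta_model)
shap_values = explainer.shap_values(meta_X[:1])             # attributions for one dataset

print("recommended algorithm:", meta_model.predict(meta_X[:1])[0])
# shap_values holds per-class attributions: one value per meta-feature,
# indicating how each descriptor pushed the recommendation toward or
# away from each candidate clustering algorithm.
```

The same workflow applies to any tree-based meta-model; for the global view, the paper relies on Decision Predicate Graphs rather than aggregated SHAP values.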
Problem

Research questions and friction points this paper is trying to address.

AutoClustering
meta-features
explainability
AutoML
unsupervised learning
Innovation

Methods, ideas, or system contributions that make the work stand out.

AutoClustering
Explainable AutoML
Meta-features
SHAP
Decision Predicate Graphs