🤖 AI Summary
This study addresses the limited reliability and diagnostic capability of existing AutoClustering systems, which stem from a lack of interpretability regarding how meta-features influence the selection of clustering algorithms and hyperparameters. For the first time, the work systematically reviews the meta-features employed across 22 AutoClustering methods and organizes them into a coherent taxonomy. By integrating global interpretability through decision predicate graphs and local interpretability via SHAP values, the authors conduct a thorough analysis of feature contributions within meta-models. Their investigation uncovers structural biases and consistent patterns in current meta-learning strategies, revealing fundamental limitations of prevailing approaches. These insights not only expose critical shortcomings but also offer actionable interpretability guidelines for designing transparent and trustworthy unsupervised AutoML systems.
📝 Abstract
AutoClustering methods aim to automate unsupervised learning tasks, including algorithm selection (AS), hyperparameter optimization (HPO), and pipeline synthesis (PS), often by leveraging meta-learning over dataset meta-features. While these systems frequently achieve strong performance, their recommendations are difficult to justify: the influence of dataset meta-features on algorithm and hyperparameter choices is typically not exposed, which limits reliability, bias diagnostics, and efficient meta-feature engineering. In this work, we investigate the explainability of meta-models in AutoClustering. We first review 22 existing methods and organize their meta-features into a structured taxonomy. We then apply a global explainability technique (Decision Predicate Graphs) to assess feature importance within the meta-models of selected frameworks. Finally, we use local explainability tools such as SHAP (SHapley Additive exPlanations) to analyse specific clustering decisions. Our findings highlight consistent patterns in meta-feature relevance, identify structural weaknesses in current meta-learning strategies that can distort recommendations, and provide actionable guidance for more interpretable Automated Machine Learning (AutoML) design. This study therefore offers a practical foundation for increasing decision transparency in unsupervised learning automation.
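To make the local-explainability step concrete: SHAP attributes a single prediction to individual input features via Shapley values. Below is a minimal, self-contained sketch (not the authors' code) that computes exact Shapley values for a toy "meta-model" scoring one clustering algorithm's suitability from a handful of hypothetical meta-features; the feature names and scoring rules are illustrative assumptions only.

```python
from itertools import combinations
from math import factorial

# Toy meta-model: scores the suitability of one clustering algorithm
# from three hypothetical meta-features (names and rules are illustrative).
def meta_model(features):
    score = 0.0
    score += 0.3 if features.get("n_samples", 0) > 1000 else 0.0
    score += 0.5 if features.get("dimensionality", 100) < 50 else 0.0
    score -= 0.2 * features.get("sparsity", 0.0)
    return score

def shapley_values(model, instance, baseline):
    """Exact Shapley values: each feature's average marginal contribution
    over all coalitions of the other features (tractable for few features).
    Absent features are replaced by their baseline values."""
    names = list(instance)
    n = len(names)
    phi = {}
    for f in names:
        others = [x for x in names if x != f]
        total = 0.0
        for size in range(n):
            for subset in combinations(others, size):
                weight = factorial(size) * factorial(n - size - 1) / factorial(n)
                present = {k: instance[k] for k in subset}
                with_f = {**baseline, **present, f: instance[f]}
                without_f = {**baseline, **present}
                total += weight * (model(with_f) - model(without_f))
        phi[f] = total
    return phi

x = {"n_samples": 5000, "dimensionality": 10, "sparsity": 0.5}
base = {"n_samples": 0, "dimensionality": 100, "sparsity": 0.0}
vals = shapley_values(meta_model, x, base)
```

By the efficiency axiom, the attributions sum to `meta_model(x) - meta_model(base)`, so one can read off how much each meta-feature pushed this particular recommendation up or down; the SHAP library applies the same idea with fast approximations for real models.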