AI Summary
Existing multi-view unsupervised feature selection (MUFS) methods rely on correlations between features and clustering labels. These correlations can be spurious when confounding factors are present, which leads to the selection of pseudo-correlated or irrelevant features. To address this, the paper takes a causal inference perspective on MUFS and proposes CAUSA, a framework that combines a generalized unsupervised spectral regression model with a causal regularization module. The module separates confounders from the multi-view data and learns view-shared sample weights that balance confounder distributions, so that the selected features are those causally informative for consensus clustering. According to the authors, comprehensive experiments show that CAUSA outperforms several state-of-the-art methods while mitigating spurious correlations.
Abstract
Multi-view unsupervised feature selection (MUFS) has recently received increasing attention for its promising ability in dimensionality reduction on multi-view unlabeled data. Existing MUFS methods typically select discriminative features by capturing correlations between features and clustering labels. However, an important yet underexplored question remains: Are such correlations sufficiently reliable to guide feature selection? In this paper, we analyze MUFS from a causal perspective by introducing a novel structural causal model, which reveals that existing methods may select irrelevant features because they overlook spurious correlations caused by confounders. Building on this causal perspective, we propose a novel MUFS method called CAusal multi-view Unsupervised feature Selection leArning (CAUSA). Specifically, we first employ a generalized unsupervised spectral regression model that identifies informative features by capturing dependencies between features and consensus clustering labels. We then introduce a causal regularization module that can adaptively separate confounders from multi-view data and simultaneously learn view-shared sample weights to balance confounder distributions, thereby mitigating spurious correlations. Thereafter, integrating both into a unified learning framework enables CAUSA to select causally informative features. Comprehensive experiments demonstrate that CAUSA outperforms several state-of-the-art methods. To our knowledge, this is the first in-depth study of causal multi-view feature selection in the unsupervised setting.
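To make the spectral-regression step more concrete, the following is a minimal sketch of sample-weighted unsupervised spectral regression across views. It is not the CAUSA objective, which the abstract does not specify. The function names, the Gaussian affinity, the averaged Laplacian used as a stand-in for consensus clustering labels, and the ridge penalty `lam` are all assumptions made for illustration. The view-shared sample weights are taken as an input here; how CAUSA learns them and separates confounders is not reproduced.

```python
import numpy as np


def gaussian_affinity(X):
    """Dense Gaussian-kernel affinity with a median-distance bandwidth (assumed choice)."""
    sq = np.sum(X**2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    sigma2 = np.median(d2[d2 > 0])
    A = np.exp(-d2 / (2.0 * sigma2))
    np.fill_diagonal(A, 0.0)
    return A


def consensus_embedding(views, n_components):
    """Pseudo-labels from the averaged normalized graph Laplacian of all views."""
    n = views[0].shape[0]
    L = np.zeros((n, n))
    for X in views:
        A = gaussian_affinity(X)
        d_inv_sqrt = 1.0 / np.sqrt(A.sum(axis=1))
        L += np.eye(n) - d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
    L /= len(views)
    _, vecs = np.linalg.eigh(L)  # eigenvalues in ascending order
    return vecs[:, 1 : n_components + 1]  # skip the trivial first eigenvector


def weighted_spectral_scores(views, n_components=1, sample_weights=None, lam=1e-2):
    """Score features per view by a sample-weighted ridge regression onto the embedding.

    sample_weights is shared by all views. Uniform weights give plain
    spectral regression; the weights are NOT learned in this sketch.
    """
    n = views[0].shape[0]
    w = np.ones(n) if sample_weights is None else np.asarray(sample_weights, dtype=float)
    w = w / w.sum() * n  # rescale so the weights average to 1
    Y = consensus_embedding(views, n_components)
    scores = []
    for X in views:
        Xc = X - np.average(X, axis=0, weights=w)  # weighted centering
        G = Xc.T @ (w[:, None] * Xc) + lam * np.eye(X.shape[1])
        W = np.linalg.solve(G, Xc.T @ (w[:, None] * Y))
        scores.append(np.linalg.norm(W, axis=1))  # one score per feature
    return scores


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    labels = np.repeat([-1.0, 1.0], 60)
    view1 = rng.normal(size=(120, 6))
    view1[:, :2] += 4.0 * labels[:, None]  # two cluster-driving features
    view2 = rng.normal(size=(120, 5))
    view2[:, 0] += 4.0 * labels  # one cluster-driving feature
    for v, s in enumerate(weighted_spectral_scores([view1, view2])):
        print(f"view {v}: feature ranking (best first) = {np.argsort(s)[::-1]}")
```

With uniform weights, the scores reflect raw feature-to-pseudo-label dependence, which is the kind of correlation the abstract argues can be spurious. Passing non-uniform `sample_weights` shows where a reweighting scheme would enter the regression.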