Beyond Correlation: Causal Multi-View Unsupervised Feature Selection Learning

2025-09-17
Citations: 0
Influential: 0
AI Summary
Existing multi-view unsupervised feature selection (MUFS) methods select features by their correlation with clustering labels. Such correlations can be spurious when confounders are present, so these methods may select pseudo-correlated or irrelevant features. To address this, the paper takes a causal-inference perspective on MUFS and proposes CAUSA, a framework that combines generalized unsupervised spectral regression with a causal regularization module. The module separates confounders from the multi-view data and learns view-shared sample weights that balance confounder distributions, so that the selected features are causally informative for the consensus clustering labels. The reported experiments show CAUSA outperforming several state-of-the-art methods.

Abstract
Multi-view unsupervised feature selection (MUFS) has recently received increasing attention for its promising ability in dimensionality reduction on multi-view unlabeled data. Existing MUFS methods typically select discriminative features by capturing correlations between features and clustering labels. However, an important yet underexplored question remains: "Are such correlations sufficiently reliable to guide feature selection?" In this paper, we analyze MUFS from a causal perspective by introducing a novel structural causal model, which reveals that existing methods may select irrelevant features because they overlook spurious correlations caused by confounders. Building on this causal perspective, we propose a novel MUFS method called CAusal multi-view Unsupervised feature Selection leArning (CAUSA). Specifically, we first employ a generalized unsupervised spectral regression model that identifies informative features by capturing dependencies between features and consensus clustering labels. We then introduce a causal regularization module that can adaptively separate confounders from multi-view data and simultaneously learn view-shared sample weights to balance confounder distributions, thereby mitigating spurious correlations. Thereafter, integrating both into a unified learning framework enables CAUSA to select causally informative features. Comprehensive experiments demonstrate that CAUSA outperforms several state-of-the-art methods. To our knowledge, this is the first in-depth study of causal multi-view feature selection in the unsupervised setting.
Problem

Research questions and friction points this paper is trying to address.

Addressing spurious correlations in multi-view unsupervised feature selection
Proposing a structural causal model that explains how confounders lead to selecting irrelevant features
Developing regularization to mitigate spurious correlations across views
Innovation

Methods, ideas, or system contributions that make the work stand out.

Structural causal model exposes confounder-induced spurious correlations
Spectral regression captures feature-label dependencies
Causal regularization mitigates spurious feature correlations
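
The reweighting idea behind the causal regularization can be illustrated with a toy example. The sketch below is not CAUSA itself: it assumes the confounder and the cluster labels are both observed, whereas the paper's method separates confounders and learns sample weights and consensus labels jointly, without supervision. The function names (`weighted_corr`, `balancing_weights`) and the synthetic data are hypothetical and exist only for illustration.

```python
import numpy as np

def weighted_corr(x, y, w):
    """Pearson correlation of x and y under sample weights w."""
    w = w / w.sum()
    mx, my = np.sum(w * x), np.sum(w * y)
    cov = np.sum(w * (x - mx) * (y - my))
    return cov / np.sqrt(np.sum(w * (x - mx) ** 2) * np.sum(w * (y - my) ** 2))

def balancing_weights(labels, confounder):
    """Weights P(y)P(c)/P(y,c), which make the confounder independent of the labels."""
    w = np.ones(len(labels))
    for y in np.unique(labels):
        for c in np.unique(confounder):
            cell = (labels == y) & (confounder == c)
            if cell.any():
                w[cell] = (labels == y).mean() * (confounder == c).mean() / cell.mean()
    return w

# Assumed toy data: (label, confounder, count) cells in which the confounder
# co-occurs with the label far more often than chance.
cells = [(0, 0, 40), (0, 1, 10), (1, 0, 10), (1, 1, 40)]
labels, confounder, causal = [], [], []
for y, c, n in cells:
    flips = n // 10                      # 10% of the causal feature is noisy
    labels += [y] * n
    confounder += [c] * n
    causal += [1 - y] * flips + [y] * (n - flips)
labels = np.array(labels, dtype=float)
confounder = np.array(confounder, dtype=float)
causal = np.array(causal, dtype=float)
spurious = confounder.copy()             # feature driven only by the confounder

uniform = np.ones_like(labels)
balanced = balancing_weights(labels, confounder)

r_spurious_raw = weighted_corr(spurious, labels, uniform)
r_spurious_bal = weighted_corr(spurious, labels, balanced)
r_causal_raw = weighted_corr(causal, labels, uniform)
r_causal_bal = weighted_corr(causal, labels, balanced)

print(f"spurious feature: raw={r_spurious_raw:.2f}, balanced={r_spurious_bal:.2f}")
print(f"causal feature:   raw={r_causal_raw:.2f}, balanced={r_causal_bal:.2f}")
```

In this toy setup the spurious feature's correlation with the labels drops from 0.6 to 0 once the confounder distribution is balanced, while the causal feature's correlation stays at 0.8. A correlation-based selector would therefore rank the spurious feature highly on the raw data but not on the reweighted data.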
Authors

Zongxin Shen
Joint Laboratory of Data Science and Business Intelligence, School of Statistics and Data Science, Southwestern University of Finance and Economics, Chengdu 611130, China
Yanyong Huang
Southwestern University of Finance and Economics
Machine Learning, Data Mining, Urban Computing, Granular Computing
Bin Wang
College of Information Science and Engineering, Ocean University of China, Qingdao 266100, China
Jinyuan Chang
Joint Laboratory of Data Science and Business Intelligence, School of Statistics and Data Science, Southwestern University of Finance and Economics, Chengdu 611130, China
Shiyu Liu
University of Electronic Science and Technology of China
Statistical Machine Learning, Federated Learning
Tianrui Li
School of Computing and Artificial Intelligence, Southwest Jiaotong University
Big Data Intelligence, Urban Computing, Granular Computing