🤖 AI Summary
Existing incomplete multi-view unsupervised feature selection (IMUFS) methods focus on missing views, so they handle the more prevalent mixed-missing scenario poorly: in practice, some samples lack entire views while others lack only some features within a view. These methods also underuse cross-view consistency and diversity, and they offer no theoretical account of how feature selection and data imputation interact. This paper proposes CLIM-FS, an IMUFS method for the mixed-missing setting. It integrates the imputation of both missing views and missing variables into a feature selection model based on nonnegative orthogonal matrix factorization, so that feature selection and adaptive data imputation are learned jointly. It exploits the consensus cluster structure across views together with the cross-view local geometrical structure, and it includes a theoretical analysis of the collaborative mechanism between feature selection and imputation. Experiments on eight real-world multi-view datasets show that CLIM-FS outperforms state-of-the-art methods.
📝 Abstract
Incomplete multi-view unsupervised feature selection (IMUFS), which aims to identify representative features from unlabeled multi-view data containing missing values, has received growing attention in recent years. Despite their promising performance, existing methods face three key challenges: 1) because they focus solely on the view-missing problem, they are not well suited to the mixed-missing scenario that is more prevalent in practice, where samples may lack entire views or only some features within a view; 2) they make insufficient use of consistency and diversity across views, which limits the effectiveness of feature selection; and 3) they lack theoretical analysis, so it remains unclear how feature selection and data imputation interact during joint learning. To address these challenges, we propose CLIM-FS, a novel IMUFS method designed for the mixed-missing problem. Specifically, we integrate the imputation of both missing views and missing variables into a feature selection model based on nonnegative orthogonal matrix factorization, enabling the joint learning of feature selection and adaptive data imputation. Furthermore, we leverage the consensus cluster structure and the cross-view local geometrical structure to enhance the synergistic learning process. We also provide a theoretical analysis that clarifies the underlying collaborative mechanism of CLIM-FS. Experimental results on eight real-world multi-view datasets demonstrate that CLIM-FS outperforms state-of-the-art methods.
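To make the mixed-missing scenario concrete, the following is a minimal sketch of how such missingness could be simulated with NumPy. It is not taken from the paper: the function names `make_mixed_missing_masks` and `mean_impute`, the missing rates, and the rule that every sample keeps at least one view are all assumptions made for illustration. The mean fill is only a naive baseline and does not represent the adaptive imputation that CLIM-FS learns jointly with feature selection.

```python
import numpy as np


def make_mixed_missing_masks(n_samples, view_dims, view_missing_rate=0.2,
                             feature_missing_rate=0.1, seed=0):
    """Hypothetical helper: one boolean mask per view, True = observed entry."""
    rng = np.random.default_rng(seed)
    n_views = len(view_dims)
    # Sample-level missingness: which (sample, view) pairs are present at all.
    view_present = rng.random((n_samples, n_views)) >= view_missing_rate
    # Assumption: keep at least one view per sample so no sample disappears.
    for i in np.flatnonzero(~view_present.any(axis=1)):
        view_present[i, rng.integers(n_views)] = True
    masks = []
    for v, d in enumerate(view_dims):
        # Feature-level missingness: individual entries dropped inside a view.
        entry_observed = rng.random((n_samples, d)) >= feature_missing_rate
        # A missing view removes the whole row of that view.
        masks.append(entry_observed & view_present[:, [v]])
    return masks, view_present


def mean_impute(X, mask):
    """Naive baseline: fill unobserved entries with the observed column mean."""
    counts = mask.sum(axis=0)
    sums = np.where(mask, X, 0.0).sum(axis=0)
    # Columns with no observed entry fall back to 0.
    col_means = np.divide(sums, counts,
                          out=np.zeros(sums.shape, dtype=float),
                          where=counts > 0)
    return np.where(mask, X, col_means)
```

A view-missing-only benchmark corresponds to `feature_missing_rate=0`; setting both rates above zero gives the mixed case, where the two kinds of gaps coexist in the same dataset.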