🤖 AI Summary
To address view incompleteness, label sparsity, and the trade-off between representation consistency and specificity in multi-view multi-label classification (MvMLC), this paper proposes the first multi-view representation disentanglement framework tailored for incomplete MvMLC. The method factorizes features into orthogonal components, view-consistent and view-specific representations, and introduces a graph-based disentanglement loss together with a task-relevant consistency learning mechanism built around three sub-objectives. Technically, the framework combines graph neural networks, masked cross-view prediction (MCP), information-theoretic constraints (mutual information maximization for shared semantics and minimization for view-specificity), and robust modeling of missing views. Evaluated on five benchmark datasets, the approach significantly outperforms state-of-the-art methods, remaining robust when views and labels are simultaneously incomplete while improving both model interpretability and classification accuracy.
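The following minimal PyTorch sketch illustrates one way the summarized objectives could be combined: an InfoNCE-style mutual-information lower bound pulls the view-consistent embeddings of different views together, while a simple orthogonality penalty stands in for mutual-information minimization between consistent and specific factors. All module names, loss forms, and weights here are illustrative assumptions, not the paper's implementation.

```python
# Hedged sketch of consistent/specific factorization with MI max/min surrogates.
# Encoder structure, InfoNCE temperature, and loss weights are assumed for illustration.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ViewEncoder(nn.Module):
    """Encodes one view into a view-consistent and a view-specific embedding."""
    def __init__(self, in_dim, hid_dim=256, out_dim=128):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(in_dim, hid_dim), nn.ReLU())
        self.consistent_head = nn.Linear(hid_dim, out_dim)  # shared semantics
        self.specific_head = nn.Linear(hid_dim, out_dim)    # view-private factors

    def forward(self, x):
        h = self.backbone(x)
        return self.consistent_head(h), self.specific_head(h)

def info_nce(z_a, z_b, temperature=0.1):
    """InfoNCE lower bound on mutual information between two views'
    consistent embeddings (maximized to extract view-shared information)."""
    z_a, z_b = F.normalize(z_a, dim=1), F.normalize(z_b, dim=1)
    logits = z_a @ z_b.t() / temperature                # (N, N) similarity matrix
    targets = torch.arange(z_a.size(0), device=z_a.device)
    return F.cross_entropy(logits, targets)

def redundancy_penalty(z_c, z_s):
    """Surrogate for MI minimization: push the consistent and specific
    embeddings of the same sample toward orthogonality."""
    z_c, z_s = F.normalize(z_c, dim=1), F.normalize(z_s, dim=1)
    return (z_c * z_s).sum(dim=1).pow(2).mean()

# Toy usage with two views; real data would come from an iMvMLC dataset.
enc1, enc2 = ViewEncoder(in_dim=300), ViewEncoder(in_dim=200)
x1, x2 = torch.randn(32, 300), torch.randn(32, 200)
c1, s1 = enc1(x1)
c2, s2 = enc2(x2)
loss = info_nce(c1, c2) + 0.1 * (redundancy_penalty(c1, s1) + redundancy_penalty(c2, s2))
loss.backward()
```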
📝 Abstract
Multi-view multi-label classification (MvMLC) has recently garnered significant research attention due to its wide range of real-world applications. However, incompleteness in views and labels is a common challenge, often resulting from oversights in data collection and uncertainties in manual annotation. Furthermore, learning robust multi-view representations that are both view-consistent and view-specific from diverse views remains a challenging problem in MvMLC. To address these issues, we propose a novel framework for incomplete multi-view multi-label classification (iMvMLC). Our method factorizes multi-view representations into two independent sets of factors, view-consistent and view-specific, and we correspondingly design a graph disentangling loss to fully reduce redundancy between them. Additionally, our framework innovatively decomposes consistent representation learning into three key sub-objectives: (i) extracting view-shared information across different views, (ii) eliminating intra-view redundancy in consistent representations, and (iii) preserving task-relevant information. To this end, we design a robust task-relevant consistency learning module that collaboratively learns high-quality consistent representations, leveraging a masked cross-view prediction (MCP) strategy and information theory. Notably, all modules in our framework are designed to function effectively under incomplete views and labels, making our method adaptable to various multi-view and multi-label datasets. Extensive experiments on five datasets demonstrate that our method outperforms other leading approaches.
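To make the masked cross-view prediction (MCP) idea concrete, the sketch below randomly hides some observed views and reconstructs each hidden view's consistent embedding from the views that remain visible, skipping views that are genuinely missing. The predictor architecture, masking rate, and reconstruction loss are assumptions chosen for a compact example, not the paper's exact design.

```python
# Hedged sketch of a masked cross-view prediction (MCP) step under missing views.
# Shapes: consistent_embs (V, N, D); observed / mask (V, N) boolean indicators.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CrossViewPredictor(nn.Module):
    """Predicts a masked view's consistent embedding from the mean of visible views."""
    def __init__(self, dim=128):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))

    def forward(self, consistent_embs, mask):
        masked_sum = (consistent_embs * mask.unsqueeze(-1)).sum(dim=0)  # (N, D)
        counts = mask.sum(dim=0, keepdim=True).clamp(min=1).t()         # (N, 1)
        context = masked_sum / counts                                    # mean over visible views
        return self.net(context)

def mcp_loss(predictor, consistent_embs, observed, drop_prob=0.3):
    """Randomly hide observed views, then predict each hidden view's embedding
    from the views left visible; genuinely missing views are never used as targets."""
    V, N, _ = consistent_embs.shape
    keep = observed & (torch.rand(V, N) > drop_prob)   # views kept visible
    pred = predictor(consistent_embs, keep)
    hidden = observed & ~keep                           # hidden but actually present views
    loss = consistent_embs.new_zeros(())
    for v in range(V):
        if hidden[v].any():
            loss = loss + F.mse_loss(pred[hidden[v]], consistent_embs[v][hidden[v]])
    return loss

# Toy usage: 3 views, 16 samples, 128-dim consistent embeddings, some views missing.
embs = torch.randn(3, 16, 128)
observed = torch.rand(3, 16) > 0.2                      # False = view missing for that sample
predictor = CrossViewPredictor(dim=128)
print(mcp_loss(predictor, embs, observed))
```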