🤖 AI Summary
This work addresses a limitation of existing evaluation paradigms for incomplete multi-view clustering (IMVC): they retrain a model for each missing pattern and characterize data incompleteness by missing rate alone, which often misjudges model robustness. The authors introduce the concept of “incompleteness divergence,” showing that the proportion of fully observed samples can vary substantially even under identical missing rates, and prove that a broad class of reconstruction-based objectives fails when this proportion falls below a critical threshold. To overcome this, they propose CRAFT, a Transformer with per-sample independence and mask-aware variable-length attention fusion that generalizes across diverse missing patterns after a single training run on complete data. Evaluated on seven benchmarks, CRAFT matches or surpasses baselines trained per configuration while reducing training overhead by 8.8×, supporting a paradigm that builds robustness into the architecture rather than the loss function.
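The point that missing rate alone does not determine the share of fully observed samples can be illustrated with a small sketch. The two mask generators below (`concentrated_mask`, `spread_mask`) are hypothetical toy protocols written for this illustration, not the protocols or divergence measures used by the authors, and the missing rate is assumed here to mean the fraction of missing (sample, view) entries.

```python
def concentrated_mask(n, v, n_missing):
    """Toy protocol: remove v-1 views from as few samples as possible."""
    mask = [[1] * v for _ in range(n)]
    remaining = n_missing
    for i in range(n):
        if remaining == 0:
            break
        k = min(v - 1, remaining)  # always keep at least one view per sample
        for j in range(k):
            mask[i][j] = 0
        remaining -= k
    return mask


def spread_mask(n, v, n_missing):
    """Toy protocol: remove one view per sample from as many samples as possible."""
    mask = [[1] * v for _ in range(n)]
    remaining = n_missing
    for r in range(v - 1):  # at most v-1 rounds, so one view always survives
        for i in range(n):
            if remaining == 0:
                return mask
            mask[i][(i + r) % v] = 0
            remaining -= 1
    return mask


def missing_rate(mask):
    """Fraction of (sample, view) entries that are missing."""
    total = sum(len(row) for row in mask)
    return sum(row.count(0) for row in mask) / total


def complete_fraction(mask):
    """Fraction of samples with every view observed."""
    return sum(all(row) for row in mask) / len(mask)


a = concentrated_mask(100, 4, 60)
b = spread_mask(100, 4, 60)
print(missing_rate(a), complete_fraction(a))  # 0.15 0.8
print(missing_rate(b), complete_fraction(b))  # 0.15 0.4
```

Both masks have the same nominal missing rate, yet the second leaves half as many complete samples as the first. The gap in this toy case is much smaller than the one reported in the paper, which depends on the protocols being compared.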
📝 Abstract
Standard IMVC evaluation retrains separate models for different missing-data configurations. We show that this paradigm obscures a fundamental vulnerability: missing rate alone is insufficient to characterize data incompleteness. Specifically, protocols with identical nominal missing rates can differ by up to $50\times$ in their proportion of fully observed samples, inducing drastically different learning regimes. We formalize this phenomenon as incompleteness divergence, providing measures that capture structural disparities across missing-data protocols. We further prove that for a broad class of reconstruction-based objectives, learning becomes structurally ill-posed when the proportion of complete samples falls below a critical threshold, leading to near-random performance. To bypass this theoretical bound, we propose CRAFT (Complete-data Robust Attention-masked Fusion Transformer). CRAFT shifts the burden of robustness from the loss function to the architecture via two key properties: (i) per-sample independence, which removes reliance on complete-sample co-occurrence, and (ii) mask-aware variable-length fusion, which aggregates only observed views through attention masking. This design allows a single model, trained once on complete data, to generalize to diverse missing patterns at inference time without retraining. Extensive experiments on seven benchmarks show that CRAFT matches or outperforms per-configuration baselines while reducing training overhead by $8.8\times$, demonstrating that robustness to missing data can be achieved as an inherent architectural property. The CRAFT code and our imvc-audit toolkit are available at https://anonymous.4open.science/r/CRAFT-BF80/ and https://anonymous.4open.science/r/imvc-audit-8263/.
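As a rough illustration of property (ii), the sketch below fuses the views of a single sample with masked attention, so that missing views receive exactly zero weight. It is a simplified stand-in and not the CRAFT architecture: the single-query pooling, the function name `masked_attention_fuse`, and the `query` vector (standing in for a learned fusion token) are all assumptions made for this example.

```python
import math


def masked_attention_fuse(view_embeddings, observed, query):
    """Fuse the observed views of ONE sample with single-query attention.

    view_embeddings: list of V vectors of length d (missing entries may be None).
    observed: list of V booleans, True where the view is present.
    query: vector of length d (hypothetical learned fusion token).
    """
    if not any(observed):
        raise ValueError("at least one view must be observed")
    d = len(query)
    # Scaled dot-product scores; missing views are masked with -inf.
    scores = [
        sum(q * k for q, k in zip(query, emb)) / math.sqrt(d) if seen else -math.inf
        for emb, seen in zip(view_embeddings, observed)
    ]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]  # exp(-inf) evaluates to 0.0
    total = sum(exps)
    weights = [e / total for e in exps]
    # Weighted sum over observed views only; masked embeddings are never read.
    fused = [
        sum(w * emb[j] for w, emb, seen in zip(weights, view_embeddings, observed) if seen)
        for j in range(d)
    ]
    return fused, weights


views = [[1.0, 0.0], [0.0, 1.0], None]  # third view is missing
fused, weights = masked_attention_fuse(views, [True, True, False], [0.0, 0.0])
print(fused, weights)  # [0.5, 0.5] [0.5, 0.5, 0.0]
```

Because the function looks at one sample at a time and normalizes only over the views that are present, it needs no complete samples in the batch and accepts any number of observed views, which is the behavior the two properties describe.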