🤖 AI Summary
This work addresses the challenge of missing modalities in multimodal learning, particularly in the biological sciences, where heterogeneous data are often incomplete. Instead of explicitly imputing missing modalities, the authors propose an availability-aware fusion approach that maps each modality into a shared latent space and treats each one as a partial observation of an underlying latent state. Using modality-specific embeddings, neighborhood-guided latent alignment, and a dynamic fusion mechanism, the method constructs a unified representation from only the available modalities. Experiments on real-world incomplete multi-omics datasets indicate that the approach is effective on downstream tasks such as cancer phenotype classification and survival prediction, and that it remains robust when modalities are missing.
📝 Abstract
We study multimodal learning under missing modalities, with particular motivation from bioscience applications in which heterogeneous modalities are often only partially available when decisions need to be made. We propose Latent World Recovery (LWR), a framework built on two key ideas: (i) modality-specific embeddings are aligned in a shared latent space, and (ii) a unified representation is constructed by fusing only the embeddings of the modalities that are actually available, at both training and inference time. Rather than imputing missing modalities or requiring a fixed modality set, LWR treats each modality as a partial perception of an underlying latent state and performs availability-aware representation learning directly from the observed modalities. This combination of neighbor-based latent alignment and availability-aware modality fusion enables robust multimodal prediction under partial observation, while avoiding the error propagation that comes from explicitly reconstructing missing modalities. We evaluate the proposed framework on real-world incomplete multi-omics benchmarks and demonstrate that it provides an effective approach to downstream tasks such as cancer phenotype classification and survival prediction.
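To make the idea of fusing only the available modalities concrete, here is a minimal sketch. It is not the LWR fusion mechanism, which the abstract does not specify. It assumes that each modality has already been embedded into a shared latent space of fixed dimension, that a missing modality is marked with `None`, and that fusion is a plain average over the modalities that are present. The function name `fuse_available` and the modality names are hypothetical.

```python
from typing import Dict, List, Optional, Sequence


def fuse_available(embeddings: Dict[str, Optional[Sequence[float]]]) -> List[float]:
    """Average the latent embeddings of the modalities that are present.

    Assumption: a missing modality is represented by None, and all present
    embeddings live in the same latent space (same dimension).
    """
    # Keep only the observed modalities; nothing is imputed for the rest.
    available = [list(vec) for vec in embeddings.values() if vec is not None]
    if not available:
        raise ValueError("at least one modality must be available")

    dim = len(available[0])
    if any(len(vec) != dim for vec in available):
        raise ValueError("all embeddings must share the same latent dimension")

    # Normalize by the number of available modalities, not the total number,
    # so the scale of the fused vector does not depend on what is missing.
    count = len(available)
    return [sum(vec[i] for vec in available) / count for i in range(dim)]


# Hypothetical sample with one of three omics modalities missing.
sample = {
    "mrna": [1.0, 2.0, 3.0],
    "methylation": None,
    "mirna": [3.0, 4.0, 5.0],
}
fused = fuse_available(sample)
```

Dividing by the number of observed modalities keeps the fused representation on a comparable scale whether one or all modalities are present. A learned, weighted fusion would replace the uniform average with per-modality weights renormalized over the available set.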