🤖 AI Summary
In single-shot quantitative phase microscopy (ssQPM), cell segmentation is hindered by noise, high cell density, the poor robustness of conventional thresholding methods, and the inability of naive multimodal channel concatenation to capture cross-modal complementarity. To address these challenges, the authors propose DM-QPMNet, a dual-encoder attention fusion network. DM-QPMNet employs two independent encoders to extract features from polarized intensity images and quantitative phase maps, respectively; integrates a multi-head attention mechanism for content-aware feature-level fusion; and incorporates dual-source skip connections and per-modality normalization to enhance cross-modal complementarity and training stability. Experiments under varying noise levels and cell densities show that DM-QPMNet consistently outperforms unimodal and simple-concatenation baselines in both segmentation accuracy and robustness, establishing a generalizable multimodal learning paradigm for ssQPM-based cellular analysis.
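The content-aware fusion described above can be sketched as cross-modal multi-head attention: flattened intensity features supply the queries, and phase features supply the keys and values, so each intensity token selectively pulls in complementary phase information. This is a minimal NumPy illustration under stated assumptions — the function name, head count, and random placeholder weights (standing in for learned projections) are hypothetical, not taken from the paper.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multihead_cross_attention(f_int, f_phase, n_heads=4, seed=0):
    """Fuse phase features into intensity features via multi-head attention.

    f_int, f_phase: (tokens, dim) feature maps from the two encoders,
    flattened spatially. Intensity tokens act as queries; phase tokens
    supply keys/values, making the fusion content-aware. The projection
    matrices here are random placeholders for learned weights.
    """
    n, d = f_int.shape
    dh = d // n_heads
    rng = np.random.default_rng(seed)
    Wq, Wk, Wv, Wo = (rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(4))
    Q, K, V = f_int @ Wq, f_phase @ Wk, f_phase @ Wv
    out = np.empty_like(Q)
    for h in range(n_heads):               # attend per head on a dim slice
        s = slice(h * dh, (h + 1) * dh)
        attn = softmax(Q[:, s] @ K[:, s].T / np.sqrt(dh))
        out[:, s] = attn @ V[:, s]
    return f_int + out @ Wo                # residual keeps the intensity stream intact
```

The residual connection is one plausible way to realize the paper's claim that fusion "preserves training stability": the intensity pathway is never overwritten, only augmented with attended phase content.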
📝 Abstract
Cell segmentation in single-shot quantitative phase microscopy (ssQPM) remains challenging: traditional thresholding methods are sensitive to noise and cell density, while deep learning approaches that rely on simple channel concatenation fail to exploit the complementary nature of polarized intensity images and phase maps. We introduce DM-QPMNet, a dual-encoder network that treats these inputs as distinct modalities with separate encoding streams. Our architecture fuses modality-specific features at intermediate depth via multi-head attention, enabling polarized edge and texture representations to selectively integrate complementary phase information. This content-aware fusion preserves training stability while adding principled multi-modal integration through dual-source skip connections and per-modality normalization at minimal overhead. Our approach demonstrates substantial improvements over monolithic concatenation and single-modality baselines, showing that modality-specific encoding with learnable fusion effectively exploits ssQPM's simultaneous capture of complementary illumination and phase cues for robust cell segmentation.
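The dual-source skip connections and per-modality normalization mentioned in the abstract can be sketched as follows: each encoder's skip features are normalized independently before being concatenated with the upsampled decoder features, so neither modality's feature statistics dominate the other. This is a minimal NumPy sketch assuming 2-D (tokens, channels) feature maps; the function names and shapes are illustrative, not from the paper.

```python
import numpy as np

def modality_norm(x, eps=1e-5):
    """Normalize one modality's features to zero mean / unit variance
    per channel, so intensity and phase statistics are comparable at
    the point of fusion."""
    mean = x.mean(axis=0, keepdims=True)
    std = x.std(axis=0, keepdims=True)
    return (x - mean) / (std + eps)

def dual_source_skip(decoder_feat, skip_int, skip_phase):
    """Dual-source skip connection: concatenate normalized skip features
    from BOTH encoders with the decoder features, instead of a single
    encoder's skip as in a plain U-Net."""
    return np.concatenate(
        [decoder_feat, modality_norm(skip_int), modality_norm(skip_phase)],
        axis=-1,
    )
```

Compared with a standard U-Net skip, the decoder here sees both modalities at every resolution, which is one way the architecture could keep cross-modal complementarity available throughout decoding.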