🤖 AI Summary
Semantic segmentation of remote sensing images often delineates objects incompletely because of large intra-class variation and high inter-class similarity. To address this, the authors propose the Prototype-Driven Structure Synergy Network (PDSSNet). PDSSNet jointly models invariant class semantics and variant spatial structure through three mechanisms: adaptive prototype extraction, hierarchical semantic-structure coordination, and dynamic channel-wise similarity adjustment. Together these unify class representations and sharpen discriminability at the feature level, which improves boundary completeness and inter-class separability. Experiments reported by the authors show that PDSSNet outperforms existing state-of-the-art methods in both segmentation accuracy and structural integrity. The source code is publicly available.
📝 Abstract
In the semantic segmentation of remote sensing images, segmenting each ground object completely is critical for precise analysis. However, this task is severely hindered by two major challenges: high intra-class variance and high inter-class similarity. Traditional methods often yield incomplete segmentation results because they cannot effectively unify class representations or distinguish between similar features. Even emerging class-guided approaches are limited by coarse class prototype representations and by their neglect of the structural information of target objects.
Therefore, this paper proposes a Prototype-Driven Structure Synergy Network (PDSSNet). The design rests on one core idea: a complete ground object is jointly defined by its invariant class semantics and its variant spatial structure. To implement this idea, we design three key modules. First, the Adaptive Prototype Extraction Module (APEM) ensures semantic accuracy at the source by encoding the ground truth to extract unbiased class prototypes. Next, the Semantic-Structure Coordination Module (SSCM) follows a hierarchical semantics-first, structure-second principle: it first establishes global semantic cognition, then uses structural information to constrain and refine the semantic representation, thereby preserving the integrity of class information. Finally, the Channel Similarity Adjustment Module (CSAM) employs a dynamic step-size adjustment mechanism to focus on the features that discriminate between classes.
Extensive experiments demonstrate that PDSSNet outperforms state-of-the-art methods. The source code is available at https://github.com/wangjunyi-1/PDSSNet.
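The abstract does not spell out how APEM encodes the ground truth, so the following is only an illustrative sketch of one common way to obtain class prototypes from ground-truth labels: masked average pooling, where each class prototype is the mean feature vector over the pixels labeled with that class. The function names `class_prototypes` and `assign_to_prototypes`, the array shapes, and the cosine-similarity assignment are assumptions made for illustration and are not taken from the paper or its repository.

```python
import numpy as np


def class_prototypes(features, labels, num_classes):
    """Masked average pooling (illustrative, not the paper's APEM).

    features: float array of shape (C, H, W)
    labels:   int array of shape (H, W) with values in [0, num_classes)
    Returns an array of shape (num_classes, C); classes that do not
    appear in `labels` get an all-zero prototype.
    """
    channels = features.shape[0]
    flat_feats = features.reshape(channels, -1)      # (C, H*W)
    flat_labels = labels.reshape(-1)                 # (H*W,)
    one_hot = np.eye(num_classes)[flat_labels]       # (H*W, K)
    sums = flat_feats @ one_hot                      # (C, K) per-class feature sums
    counts = one_hot.sum(axis=0)                     # (K,) pixels per class
    protos = sums / np.maximum(counts, 1.0)          # guard against empty classes
    return protos.T                                  # (K, C)


def assign_to_prototypes(features, prototypes, eps=1e-8):
    """Label each pixel with the prototype of highest cosine similarity."""
    channels, height, width = features.shape
    flat = features.reshape(channels, -1).T          # (H*W, C)
    flat_n = flat / (np.linalg.norm(flat, axis=1, keepdims=True) + eps)
    proto_n = prototypes / (np.linalg.norm(prototypes, axis=1, keepdims=True) + eps)
    similarity = flat_n @ proto_n.T                  # (H*W, K)
    return similarity.argmax(axis=1).reshape(height, width)


# Tiny example: 2 channels, a 2x2 image, 3 possible classes (class 2 absent).
features = np.array([[[1.0, 3.0], [0.0, 0.0]],
                     [[0.0, 0.0], [2.0, 4.0]]])
labels = np.array([[0, 0], [1, 1]])
protos = class_prototypes(features, labels, num_classes=3)
predicted = assign_to_prototypes(features, protos)
```

Because the prototypes here are computed from the labels rather than from a possibly wrong prediction, they are not biased by segmentation errors, which is the sense in which prototypes derived from ground truth can be called unbiased. At inference time no labels exist, so a real system would need a learned stand-in for this step; how the paper handles that is not stated in the abstract.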