🤖 AI Summary
Short text classification faces challenges including semantic sparsity, scarce labeled data, and the limitations of existing methods, particularly their reliance on data augmentation, which often induces semantic distortion and hinders effective multi-view information integration. To address these issues, this paper proposes SimSTC, a graph contrastive learning framework that eliminates explicit data augmentation. SimSTC constructs heterogeneous text graphs from three components: word co-occurrence, document-word associations, and document similarity. A graph neural network encodes each component graph to produce multi-view text embeddings, and an InfoNCE loss is optimized directly on these embeddings under implicit multi-view consistency constraints. Notably, SimSTC is the first method to discard data augmentation entirely in short text classification, replacing manual perturbations with lightweight, structure-aware graph modeling that jointly enhances discriminability and robustness. Extensive experiments on multiple benchmark datasets show that SimSTC achieves state-of-the-art performance with significantly fewer parameters than leading GNNs and large language models.
📝 Abstract
Short text classification has gained significant attention in the information age due to its prevalence and real-world applications. Recent advancements in graph learning combined with contrastive learning have shown promising results in addressing the challenges of semantic sparsity and limited labeled data in short text classification. However, existing models have certain limitations. They rely on explicit data augmentation techniques to generate contrastive views, resulting in semantic corruption and noise. Additionally, these models only focus on learning the intrinsic consistency between the generated views, neglecting valuable discriminative information from other potential views. To address these issues, we propose a Simple graph contrastive learning framework for Short Text Classification (SimSTC). Our approach involves performing graph learning on multiple text-related component graphs to obtain multi-view text embeddings. Subsequently, we directly apply contrastive learning on these embeddings. Notably, our method eliminates the need for data augmentation operations to generate contrastive views while still leveraging the benefits of multi-view contrastive learning. Despite its simplicity, our model achieves outstanding performance, surpassing large language models on various datasets.
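The core idea, contrasting multi-view text embeddings directly with an InfoNCE loss instead of generating augmented views, can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the authors' released code: the function names (`info_nce`, `multi_view_loss`), the temperature value, and the pairwise averaging over views are choices made here for clarity.

```python
import numpy as np

def info_nce(z1, z2, tau=0.5):
    """InfoNCE loss between two views of the same batch of documents.

    Row i of z1 and row i of z2 are embeddings of the same document
    (the positive pair); all other rows serve as negatives.
    """
    # L2-normalize so the dot product is cosine similarity.
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    sim = z1 @ z2.T / tau                         # similarity matrix, scaled by temperature
    sim = sim - sim.max(axis=1, keepdims=True)    # shift rows for numerical stability
    exp = np.exp(sim)
    pos = np.diag(exp)                            # positives: same document across views
    loss = -np.log(pos / exp.sum(axis=1))
    return loss.mean()

def multi_view_loss(views, tau=0.5):
    """Average InfoNCE over all ordered pairs of distinct views.

    `views` is a list of (batch, dim) embedding matrices, e.g. one per
    component graph (word, document-word, document similarity).
    """
    total, pairs = 0.0, 0
    for i in range(len(views)):
        for j in range(len(views)):
            if i != j:
                total += info_nce(views[i], views[j], tau)
                pairs += 1
    return total / pairs
```

In this sketch the "views" would come from separate GNN encoders over the three component graphs; because they are produced by graph learning rather than by perturbing the input text, no augmentation step is needed to form contrastive pairs.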