Prompt Categories Cluster for Weakly Supervised Semantic Segmentation

📅 2024-12-18

🏛️ arXiv.org

📈 Citations: 1

✨ Influential: 0

career value

165K/year

🤖 AI Summary

Weakly supervised semantic segmentation (WSSS) relies on image-level labels, yet existing methods overemphasize inter-class discriminability while neglecting semantic sharing among visually or semantically similar categories, leading to ambiguous class activation maps and suboptimal segmentation performance. To address this, we introduce large language models (LLMs) into WSSS for the first time, leveraging prompt engineering to generate interpretable, semantically coherent category clusters that explicitly model inter-class commonalities—not just distinctions. We further propose a unified framework comprising category embedding clustering, relation-aware feature fusion, and multi-stage weakly supervised training to enable pixel-level guidance grounded in shared semantics. Evaluated on PASCAL VOC 2012, our method achieves a new state-of-the-art mIoU of 72.3%, demonstrating significant improvements in both accuracy and robustness. This work validates the effectiveness and generalizability of LLM-driven semantic clustering for WSSS.

Technology Category

Application Category

📝 Abstract

Weakly Supervised Semantic Segmentation (WSSS), which leverages image-level labels, has garnered significant attention due to its cost-effectiveness. The previous methods mainly strengthen the inter-class differences to avoid class semantic ambiguity which may lead to erroneous activation. However, they overlook the positive function of some shared information between similar classes. Categories within the same cluster share some similar features. Allowing the model to recognize these features can further relieve the semantic ambiguity between these classes. To effectively identify and utilize this shared information, in this paper, we introduce a novel WSSS framework called Prompt Categories Clustering (PCC). Specifically, we explore the ability of Large Language Models (LLMs) to derive category clusters through prompts. These clusters effectively represent the intrinsic relationships between categories. By integrating this relational information into the training network, our model is able to better learn the hidden connections between categories. Experimental results demonstrate the effectiveness of our approach, showing its ability to enhance performance on the PASCAL VOC 2012 dataset and surpass existing state-of-the-art methods in WSSS.

Problem

Research questions and friction points this paper is trying to address.

Addresses semantic ambiguity in Weakly Supervised Semantic Segmentation (WSSS).

Leverages shared features between similar classes to improve model performance.

Utilizes Large Language Models (LLMs) for category clustering in WSSS.

Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses LLMs to derive category clusters via prompts

Integrates relational info into training network

Enhances performance on PASCAL VOC 2012 dataset

🔎 Similar Papers

SemPLeS: Semantic Prompt Learning for Weakly-Supervised Semantic Segmentation