A Generative Adversarial Approach to Adversarial Attacks Guided by Contrastive Language-Image Pre-trained Model

📅 2025-11-03
📈 Citations: 0
✨ Influential: 0
📄 PDF
🤖 AI Summary
Existing adversarial attacks against multi-label classifiers suffer from perceptible perturbations and poor transferability. Method: This paper proposes a generative adversarial attack that leverages CLIP's text-image alignment capability. It adopts the concentrated, saliency-guided perturbation strategy of the Saliency-based Auto-Encoder (SSAE) to localize perturbations, and incorporates dissimilar text embeddings in the style of Generative Adversarial Multi-Object Scene Attacks (GAMA), thereby constructing a text-guided, semantics-aware loss function. This enables precise manipulation of multi-label predictions while preserving visual fidelity. Contribution/Results: Compared with state-of-the-art methods, the proposed approach achieves comparable or superior attack success rates across multiple black-box victim models, while improving the imperceptibility and cross-model transferability of the adversarial examples, offering a semantics-driven way to evaluate the robustness of multi-label classification systems.

πŸ“ Abstract
The rapid growth of deep learning has produced powerful models for tasks such as image recognition and language understanding. However, adversarial attacks, which introduce alterations that go unnoticed by human observers, can deceive these models into making inaccurate predictions. In this paper, a generative adversarial attack method is proposed that uses the CLIP model to create highly effective yet visually imperceptible adversarial perturbations. CLIP's ability to align text and image representations lets the method incorporate natural-language semantics into a guided loss that generates effective adversarial examples which look identical to the original inputs. This integration enables extensive scene manipulation, creating perturbations in multi-object environments specifically designed to deceive multi-label classifiers. Our approach combines the concentrated perturbation strategy of the Saliency-based Auto-Encoder (SSAE) with dissimilar text embeddings in the style of Generative Adversarial Multi-Object Scene Attacks (GAMA), yielding perturbations that both deceive classification models and maintain high structural similarity to the original images. The method was evaluated on various tasks across diverse black-box victim models. Experimental results show that it performs competitively, achieving comparable or superior results to existing techniques while preserving greater visual fidelity.
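The paper does not include code here, but the text-guided loss the abstract describes can be illustrated with a minimal numpy sketch. All names, the fidelity weight `lam`, and the plain cosine-similarity formulation are assumptions for illustration; the embedding vectors stand in for the outputs of CLIP's image and text encoders.

```python
import numpy as np

def cosine_sim(a, b):
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def text_guided_adv_loss(img_emb, true_text_emb, target_text_emb, x, x_adv, lam=0.1):
    """Hypothetical text-guided adversarial loss (illustrative only):
    push the perturbed image's embedding away from the true-label text
    embedding, pull it toward a dissimilar target text embedding, and
    penalize visible change with an L2 fidelity term on the pixels."""
    attack_term = cosine_sim(img_emb, true_text_emb) - cosine_sim(img_emb, target_text_emb)
    fidelity_term = lam * np.mean((x_adv - x) ** 2)
    return attack_term + fidelity_term
```

Minimizing this loss drives the generator to produce images whose CLIP embedding no longer matches the true labels' text, which is the semantic lever the abstract attributes to CLIP guidance.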
Problem

Research questions and friction points this paper is trying to address.

Generating imperceptible adversarial perturbations using the CLIP model
Deceiving multi-label classifiers in multi-object scene environments
Maintaining high visual similarity while attacking black-box models
Innovation

Methods, ideas, or system contributions that make the work stand out.

Generative adversarial attack using CLIP model
Integrates text semantics for perturbation generation
Combines SSAE and GAMA strategies effectively
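As a rough illustration of the SSAE-style concentration idea listed above, the sketch below weights a raw perturbation by a saliency map so the change lands on salient regions, then rescales it to a fixed L-infinity budget. The function name, the normalization scheme, and the `budget` parameter are assumptions, not the paper's actual procedure.

```python
import numpy as np

def concentrate_perturbation(delta, saliency, budget=0.05):
    """Hypothetical saliency-concentrated perturbation step:
    suppress the perturbation outside salient regions, then clip its
    magnitude to a fixed L-infinity budget for imperceptibility."""
    w = saliency / (saliency.max() + 1e-8)        # normalize saliency to [0, 1]
    focused = delta * w                           # zero out non-salient pixels
    scale = budget / (np.abs(focused).max() + 1e-8)
    return focused * min(1.0, scale)              # enforce the budget
```

A perturbation shaped this way stays small and spatially concentrated, which is consistent with the paper's stated goal of high structural similarity to the original image.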
Sampriti Soor
Department of Computer Science and Engineering, Indian Institute of Technology Guwahati, 781039, Assam, India.
Alik Pramanick
Department of Computer Science and Engineering, Indian Institute of Technology Guwahati, 781039, Assam, India.
Jothiprakash K
Department of Computer Science and Engineering, Indian Institute of Technology Guwahati, 781039, Assam, India.
Arijit Sur
Professor, Dept. of Computer Science and Engineering, Indian Institute of Technology Guwahati
Computer Vision, Machine Learning, Medical Imaging, Adaptive Video Streaming