SCOOTER: A Human Evaluation Framework for Unrestricted Adversarial Examples

📅 2025-07-10
🤖 AI Summary
Problem: Existing evaluations of unrestricted adversarial examples lack statistically rigorous human-perception studies, which makes it hard to verify how authentic and imperceptible the examples look. Method: The paper introduces SCOOTER, an open-source, statistically powered human evaluation framework that combines crowdsourced study design, Likert-scale ratings, statistical power analysis, and GPT-4o as a preliminary imperceptibility test, enabling a large-scale human vs. model comparison. Contribution/Results: Using more than 34K ratings from 346 participants, the authors build an ImageNet-derived benchmark of 3K real images and 7K adversarial examples. Across the six attacks studied (three color-space and three diffusion-based), none produces images that humans find imperceptible, while GPT-4o consistently detects adversarial examples for only four of the six, indicating that automated vision systems do not align with human perception. The work also provides best-practice guidelines and open-source tooling for reproducible evaluation of adversarial examples.

📝 Abstract
Unrestricted adversarial attacks aim to fool computer vision models without being constrained by $\ell_p$-norm bounds to remain imperceptible to humans, for example, by changing an object's color. This allows attackers to circumvent traditional, norm-bounded defense strategies such as adversarial training or certified defense strategies. However, due to their unrestricted nature, there are also no guarantees of norm-based imperceptibility, necessitating human evaluations to verify just how authentic these adversarial examples look. While some related work assesses this vital quality of adversarial attacks, none provide statistically significant insights. This issue necessitates a unified framework that supports and streamlines such an assessment for evaluating and comparing unrestricted attacks. To close this gap, we introduce SCOOTER - an open-source, statistically powered framework for evaluating unrestricted adversarial examples. Our contributions are: $(i)$ best-practice guidelines for crowd-study power, compensation, and Likert equivalence bounds to measure imperceptibility; $(ii)$ the first large-scale human vs. model comparison across 346 human participants showing that three color-space attacks and three diffusion-based attacks fail to produce imperceptible images. Furthermore, we found that GPT-4o can serve as a preliminary test for imperceptibility, but it only consistently detects adversarial examples for four out of six tested attacks; $(iii)$ open-source software tools, including a browser-based task template to collect annotations and analysis scripts in Python and R; $(iv)$ an ImageNet-derived benchmark dataset containing 3K real images, 7K adversarial examples, and over 34K human ratings. Our findings demonstrate that automated vision systems do not align with human perception, reinforcing the need for a ground-truth SCOOTER benchmark.
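
The abstract mentions study power and Likert equivalence bounds without giving the procedure. One common way to formalize "imperceptible" is a two one-sided tests (TOST) equivalence check, which asks whether the mean rating of adversarial images stays within a margin of the mean rating of real images. The following is a minimal sketch of that generic idea, not SCOOTER's actual analysis scripts: the function names, the margin `delta`, the rating spread `sigma`, and the normal approximation (which treats ordinal Likert scores as interval data) are all assumptions made for illustration.

```python
import math
from statistics import NormalDist, mean, variance

_STD_NORMAL = NormalDist()  # standard normal, used for the z-approximation


def tost_sample_size(sigma, delta, alpha=0.05, power=0.8):
    """Per-group sample size for a two-sample TOST (hypothetical helper).

    Assumes equal group sizes, a common standard deviation `sigma`,
    a true mean difference of zero, and a normal approximation.
    """
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha)
    z_beta = _STD_NORMAL.inv_cdf(1 - (1 - power) / 2)
    return math.ceil(2 * (sigma * (z_alpha + z_beta) / delta) ** 2)


def tost_p_value(adv_ratings, clean_ratings, delta):
    """TOST p-value for equivalence within +/- delta (hypothetical helper).

    A small p-value supports the claim that the mean ratings differ by
    less than `delta`. Requires non-constant ratings (standard error > 0).
    """
    diff = mean(adv_ratings) - mean(clean_ratings)
    se = math.sqrt(
        variance(adv_ratings) / len(adv_ratings)
        + variance(clean_ratings) / len(clean_ratings)
    )
    # One-sided test against H0: diff <= -delta
    p_lower = 1 - _STD_NORMAL.cdf((diff + delta) / se)
    # One-sided test against H0: diff >= +delta
    p_upper = _STD_NORMAL.cdf((diff - delta) / se)
    return max(p_lower, p_upper)


if __name__ == "__main__":
    # Illustrative 5-point Likert ratings, not data from the paper.
    clean = [3, 4, 5, 4, 3, 4, 5, 4] * 10
    adv = [x - 2 for x in clean]  # clearly lower-rated images
    print(tost_sample_size(sigma=1.0, delta=0.5))
    print(tost_p_value(clean, clean, delta=0.5))
    print(tost_p_value(adv, clean, delta=0.5))
```

Under this sketch, an attack would count as imperceptible only if the TOST p-value falls below the chosen significance level. A large p-value means equivalence could not be established; it does not by itself show that the images are detectable.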
Problem

Research questions and friction points this paper is trying to address.

Evaluating unrestricted adversarial examples' human imperceptibility
Lack of statistically significant human evaluation frameworks
Misalignment between automated vision systems and human perception
Innovation

Methods, ideas, or system contributions that make the work stand out.

Framework for human evaluation of adversarial examples
Large-scale human vs. model comparison study
Open-source tools and benchmark dataset