Complete Evasion, Zero Modification: PDF Attacks on AI Text Detection

📅 2025-08-03

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the limited robustness of AI-generated text detectors against adversarial evasion attacks. We first identify a previously unrecognized vulnerability stemming from the mismatch between visual layout and logical extraction order in PDF documents. To exploit it, we propose PDFuzz, an attack that subtly perturbs character positioning within the PDF structure, without altering any textual content, so that text extractors read the characters out of sequence. PDFuzz fully preserves the document's visual appearance, making it a zero-content-modification adversarial attack. Evaluated against the ArguGPT detector, PDFuzz reduces detection accuracy from 93.6% to 50.4% and drives the F1 score to zero, which is no better than random guessing. These results show that AI text detection is fragile at the PDF format layer.
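The gap between visual layout and extraction order can be illustrated with a toy model. The sketch below is not the PDFuzz implementation: it assumes a simplified page where every character carries its own explicit grid position, and the names `Glyph`, `layout`, `scramble_stream`, `extract_stream_order`, and `render_visual_order` are hypothetical. It only shows that once positions are explicit, the order of characters in the file no longer has to match reading order.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Glyph:
    char: str
    x: int  # column on a hypothetical page grid
    y: int  # line index, 0 is the top line


def layout(text, width=40):
    """Place each character on a fixed grid, in reading order."""
    return [Glyph(c, i % width, i // width) for i, c in enumerate(text)]


def scramble_stream(glyphs, seed=0):
    """Reorder glyphs in the stream; every position stays untouched."""
    shuffled = list(glyphs)
    random.Random(seed).shuffle(shuffled)
    return shuffled


def extract_stream_order(glyphs):
    """Naive extractor (assumption): concatenate in stream order."""
    return "".join(g.char for g in glyphs)


def render_visual_order(glyphs):
    """What a reader sees: glyphs ordered top to bottom, left to right."""
    ordered = sorted(glyphs, key=lambda g: (g.y, g.x))
    return "".join(g.char for g in ordered)


text = "This essay was written by a language model."
attacked = scramble_stream(layout(text))

visual = render_visual_order(attacked)      # identical to the original text
extracted = extract_stream_order(attacked)  # same characters, scrambled order
```

In this model the rendered page is unchanged and the extracted string contains exactly the same characters, but a detector that consumes `extracted` sees text that no longer resembles the original.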

📝 Abstract

AI-generated text detectors have become essential tools for maintaining content authenticity, yet their robustness against evasion attacks remains questionable. We present PDFuzz, a novel attack that exploits the discrepancy between visual text layout and extraction order in PDF documents. Our method preserves exact textual content while manipulating character positioning to scramble extraction sequences. We evaluate this approach against the ArguGPT detector using a dataset of human and AI-generated text. Our results demonstrate complete evasion: detector performance drops from (93.6 $\pm$ 1.4)% accuracy and 0.938 $\pm$ 0.014 F1 score to random-level performance ((50.4 $\pm$ 3.2)% accuracy, 0.0 F1 score) while maintaining perfect visual fidelity. Our work reveals a vulnerability in current detection systems that is inherent to PDF document structures and underscores the need for implementing sturdy safeguards against such attacks. We make our code publicly available at https://github.com/ACMCMC/PDFuzz.
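An F1 score of 0.0 alongside roughly 50% accuracy is what one would expect if the detector stops flagging any attacked document as AI-generated. The sketch below works through that arithmetic with hypothetical counts. It assumes a balanced dataset and treats "AI-generated" as the positive class, neither of which is stated in the abstract, and the helper `accuracy_and_f1` is illustrative only.

```python
def accuracy_and_f1(tp, fp, tn, fn):
    """Compute accuracy and F1 from confusion-matrix counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    denom = 2 * tp + fp + fn
    # F1 = 2TP / (2TP + FP + FN); define it as 0.0 when undefined.
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1


# Hypothetical counts: 100 AI texts and 100 human texts,
# with every document classified as human after the attack.
acc, f1 = accuracy_and_f1(tp=0, fp=0, tn=100, fn=100)
```

With no true positives, F1 is zero regardless of how many human texts are classified correctly, while accuracy stays at the share of human texts in the dataset.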

Problem

Research questions and friction points this paper is trying to address.

Evading AI text detection via PDF layout manipulation

Exploiting PDF text extraction-sequence vulnerabilities

Exposing robustness flaws in current AI detectors

Innovation

Methods, ideas, or system contributions that make the work stand out.

Exploits PDF text layout vs extraction discrepancy

Manipulates character positioning to scramble sequences

Maintains visual fidelity while evading detection

🔎 Similar Papers

SilverSpeak: Evading AI-Generated Text Detectors using Homoglyphs