Who Wrote This? The Key to Zero-Shot LLM-Generated Text Detection Is GECScore

๐Ÿ“… 2024-05-07
๐Ÿ›๏ธ International Conference on Computational Linguistics
๐Ÿ“ˆ Citations: 3
โœจ Influential: 0
๐Ÿค– AI Summary
This paper addresses the challenge of detecting LLM-generated text in the absence of large-scale labeled data. We propose GECScore, a training-free, source-model-agnostic black-box zero-shot detection method. Its core insight is that human-written text exhibits systematically higher grammatical error rates than LLM-generated text. GECScore leverages pre-trained grammatical error correction (GEC) models (e.g., BART or PIE) to compute the probability difference between original and corrected textโ€”requiring no fine-tuning or supervision. Lightweight, model-agnostic, and robust to paraphrasing attacks, GECScore achieves an average AUROC of 98.62% on XSum and Writing Prompts benchmarks, outperforming existing zero-shot and supervised methods. It further demonstrates strong generalization in real-world settings and under adversarial paraphrasing.

๐Ÿ“ Abstract
The efficacy of detectors for texts generated by large language models (LLMs) substantially depends on the availability of large-scale training data. However, white-box zero-shot detectors, which require no such data, are limited by the accessibility of the source model of the LLM-generated text. In this paper, we propose a simple yet effective black-box zero-shot detection approach based on the observation that, from the perspective of LLMs, human-written texts typically contain more grammatical errors than LLM-generated texts. This approach involves calculating the Grammar Error Correction Score (GECScore) for the given text to differentiate between human-written and LLM-generated text. Experimental results show that our method outperforms current state-of-the-art (SOTA) zero-shot and supervised methods, achieving an average AUROC of 98.62% across the XSum and Writing Prompts datasets. Additionally, our approach demonstrates strong reliability in the wild, exhibiting robust generalization and resistance to paraphrasing attacks. Data and code are available at: https://github.com/NLP2CT/GECScore.
Problem

Research questions and friction points this paper is trying to address.

Detecting LLM-generated text without large-scale labeled training data.
Distinguishing human-written from LLM-generated text in a black-box setting, without access to the source model.
Improving the accuracy and robustness of zero-shot detection, including under paraphrasing attacks.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Black-box, training-free zero-shot detection via the proposed GECScore.
Exploits the observation that human-written text contains more grammatical errors, from an LLM's perspective, than LLM-generated text.
Outperforms SOTA zero-shot and supervised methods, reaching an average AUROC of 98.62%.
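The detection idea above can be sketched in a few lines. Note this is a toy illustration only: the paper's GECScore is computed from a pretrained grammatical error correction model (e.g., BART) by comparing the original text with its corrected version, whereas here a hand-supplied "corrected" string and a word-level edit-distance proxy (via `difflib`) stand in for the GEC model and the exact scoring; the `classify` helper and its threshold are hypothetical.

```python
import difflib

def gec_score(original: str, corrected: str) -> float:
    """Divergence between a text and its grammar-corrected version.

    Higher score = more corrections were needed = more likely
    human-written, per the paper's core observation. This proxy uses
    word-level sequence similarity instead of the paper's scoring.
    """
    ratio = difflib.SequenceMatcher(
        None, original.split(), corrected.split()
    ).ratio()
    return 1.0 - ratio  # 0.0 = no edits needed, 1.0 = fully rewritten

def classify(original: str, corrected: str, threshold: float = 0.1) -> str:
    """Label a text by thresholding its GEC divergence (toy threshold)."""
    return "human" if gec_score(original, corrected) > threshold else "llm"

# Toy example: the "corrected" strings stand in for GEC model output.
human = "Me and him goes to the store yesterday for buy some breads."
human_fixed = "He and I went to the store yesterday to buy some bread."
llm = "He and I went to the store yesterday to buy some bread."
llm_fixed = llm  # fluent LLM text needs no corrections

print(classify(human, human_fixed))  # human
print(classify(llm, llm_fixed))      # llm
```

In the actual method, no reference "corrected" text is needed at inference time: the GEC model itself produces the correction, so detection remains fully black-box and training-free.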
๐Ÿ‘ฅ Authors

Junchao Wu
NLP2CT Lab, Department of Computer and Information Science, University of Macau

Runzhe Zhan
Ph.D. Candidate, University of Macau
Machine Translation · Language Models · Multilinguality

Derek F. Wong
Professor, Department of Computer and Information Science, University of Macau
Machine Translation · Neural Machine Translation · Natural Language Processing · Machine Learning

Shu Yang
NLP2CT Lab, Department of Computer and Information Science, University of Macau

Xuebo Liu
Institute of Computing and Intelligence, Harbin Institute of Technology, Shenzhen, China

Lidia S. Chao
University of Macau

Min Zhang
Institute of Computing and Intelligence, Harbin Institute of Technology, Shenzhen, China