🤖 AI Summary
AI-generated and human-written texts exhibit substantial feature overlap, causing declining accuracy and poor interpretability in existing detection methods. Method: We propose a zero-shot detection framework inspired by DNA mutation repair: treating AI-generated text as a sequence containing “mutations,” we iteratively repair tokens under language model probability guidance, accumulating repair cost as an interpretable discriminative signal—requiring no fine-tuning or training data. Contribution/Results: Our approach directly quantifies generation divergence and achieves state-of-the-art performance across multiple benchmarks, with relative improvements of 5.55% in AUROC and 2.08% in F1 score. It demonstrates strong robustness against adversarial attacks, cross-model generalization gaps, and varying text lengths. The core innovation lies in modeling biological repair mechanisms as an interpretable sequence optimization process, establishing a novel paradigm for AI-text detection in high-overlap regimes.
📝 Abstract
The rapid advancement of large language models (LLMs) has blurred the line between AI-generated and human-written text. This progress brings societal risks such as misinformation, authorship ambiguity, and intellectual property concerns, highlighting the urgent need for reliable AI-generated text detection methods. However, recent advances in generative language modeling have produced significant overlap between the feature distributions of human-written and AI-generated text, eroding classification boundaries and making accurate detection increasingly challenging. To address these challenges, we propose a DNA-inspired perspective, leveraging a repair-based process to directly and interpretably capture the intrinsic differences between human-written and AI-generated text. Building on this perspective, we introduce DNA-DetectLLM, a zero-shot method for distinguishing AI-generated from human-written text. The method constructs an ideal AI-generated sequence for each input, iteratively repairs non-optimal tokens, and quantifies the cumulative repair effort as an interpretable detection signal. Empirical evaluations demonstrate that our method achieves state-of-the-art detection performance and exhibits strong robustness across various adversarial attacks and input lengths. Specifically, DNA-DetectLLM achieves relative improvements of 5.55% in AUROC and 2.08% in F1 score across multiple public benchmark datasets.
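The repair process described above can be sketched in miniature. This is not the authors' implementation: a toy deterministic bigram ranking stands in for a real language model, and the rank-distance repair cost is an illustrative choice, but it shows the core idea of repairing each non-optimal token toward the model's top choice and accumulating the cost as a detection signal (low cost suggests AI-generated, high cost suggests human-written).

```python
# Illustrative sketch of repair-based zero-shot detection (assumptions:
# toy bigram "LM" and rank-distance cost; a real system would use an
# LLM's next-token probabilities).

def repair_score(tokens, topk_fn):
    """Iteratively repair tokens toward the model-optimal sequence,
    accumulating repair cost. Higher cost = larger divergence from the
    model's preferred continuation."""
    cost = 0
    repaired = list(tokens)
    for i in range(1, len(repaired)):
        ranking = topk_fn(tuple(repaired[:i]))  # tokens by model preference
        if not ranking:
            continue  # model has no prediction for this context
        actual, optimal = repaired[i], ranking[0]
        if actual != optimal:
            # Repair cost: how far the observed token ("mutation") sits
            # from the model's top choice.
            cost += ranking.index(actual) if actual in ranking else len(ranking)
            repaired[i] = optimal  # repair the mutation
    return cost

# Toy stand-in for an LM: next-token candidates ranked by probability.
BIGRAM = {
    "the": ["cat", "dog", "moon"],
    "cat": ["sat", "ran"],
    "sat": ["down", "still"],
}

def toy_topk(context):
    return BIGRAM.get(context[-1], [])

# A model-optimal sequence needs no repairs; a divergent one does.
print(repair_score(["the", "cat", "sat", "down"], toy_topk))   # 0
print(repair_score(["the", "moon", "sat", "still"], toy_topk)) # 3
```

In the actual method, the repair signal comes from the detector LM's token probabilities rather than a fixed table, and the cumulative effort is thresholded to classify the text; the accumulated per-token costs also make the decision interpretable.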