🤖 AI Summary
Existing large language model (LLM) text detectors exhibit insufficient robustness across diverse decoding strategies, undermining their practical reliability. Method: This work systematically investigates how sampling-based decoding—specifically temperature scaling, top-p (nucleus) sampling, and related variants—affects text detectability, focusing on the degradation mechanisms induced by (sub)word-level distributional perturbations. We construct a large-scale benchmark comprising 37 distinct decoding configurations and comprehensively evaluate state-of-the-art detectors using AUROC as the primary metric. Contribution/Results: We find that minor adjustments to decoding parameters can reduce AUROC from near 100% to as low as 1%, exposing detectors’ extreme sensitivity to generation settings. The study identifies fundamental flaws in current evaluation paradigms and advocates for decoder-agnostic, robust detection frameworks alongside standardized evaluation protocols for decoding diversity. To foster reproducibility and community advancement, we publicly release all data and code.
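AUROC, the primary metric here, is the probability that a detector scores a randomly chosen machine-generated text higher than a randomly chosen human-written one, so a value near 1% means the detector's ranking is almost perfectly inverted, far worse than the 50% of random guessing. A minimal pairwise-comparison sketch (the detector scores below are hypothetical, for illustration only):

```python
def auroc(human_scores, machine_scores):
    # Probability that a random machine-generated text receives a higher
    # detector score than a random human-written one (ties count as 0.5).
    wins = 0.0
    for m in machine_scores:
        for h in human_scores:
            if m > h:
                wins += 1.0
            elif m == h:
                wins += 0.5
    return wins / (len(machine_scores) * len(human_scores))

# Hypothetical detector scores (higher = "more likely machine-generated"):
human        = [0.10, 0.20, 0.30]
machine_easy = [0.80, 0.90, 0.95]  # cleanly separated -> AUROC = 1.0
machine_hard = [0.01, 0.02, 0.05]  # a decoding shift pushes scores
                                   # below the human range -> AUROC = 0.0
```

An AUROC collapse from ~100% to ~1% thus indicates that, under the new decoding configuration, machine text systematically scores *lower* than human text on the detector's scale.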
📝 Abstract
As texts generated by Large Language Models (LLMs) are ever more common and often indistinguishable from human-written content, research on automatic text detection has attracted growing attention. Many recent detectors report near-perfect accuracy, often boasting AUROC scores above 99%. However, these claims typically assume fixed generation settings, leaving open the question of how robust such systems are to changes in decoding strategies. In this work, we systematically examine how sampling-based decoding impacts detectability, with a focus on how subtle variations in a model's (sub)word-level distribution affect detection performance. We find that even minor adjustments to decoding parameters, such as temperature or top-p (nucleus) sampling, can severely impair detector accuracy, with AUROC dropping from near-perfect levels to 1% in some settings. Our findings expose critical blind spots in current detection methods and emphasize the need for more comprehensive evaluation protocols. To facilitate future research, we release a large-scale dataset encompassing 37 decoding configurations, along with our code and evaluation framework at https://github.com/BaggerOfWords/Sampling-and-Detection.
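The decoding parameters discussed above reshape the model's next-token distribution before sampling: temperature rescales the logits (lower values sharpen the distribution, higher values flatten it), while top-p (nucleus) sampling truncates it to the smallest set of tokens whose cumulative probability reaches p. A minimal sketch over a toy four-token distribution (the logit values are illustrative, not from the paper):

```python
import math

def apply_temperature(logits, temperature):
    # Divide logits by T, then softmax: T < 1 sharpens the distribution,
    # T > 1 flattens it toward uniform.
    scaled = [l / temperature for l in logits]
    m = max(scaled)                      # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    z = sum(exps)
    return [e / z for e in exps]

def top_p_filter(probs, p):
    # Nucleus (top-p) sampling: keep the smallest set of highest-probability
    # tokens whose cumulative mass reaches p, zero out the rest, renormalize.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cum = [], 0.0
    for i in order:
        kept.append(i)
        cum += probs[i]
        if cum >= p:
            break
    mass = sum(probs[i] for i in kept)
    out = [0.0] * len(probs)
    for i in kept:
        out[i] = probs[i] / mass
    return out

logits = [2.0, 1.0, 0.5, -1.0]           # toy next-token logits
sharp = apply_temperature(logits, 0.7)   # more mass on the top token
flat = apply_temperature(logits, 1.5)    # more mass on the tail
nucleus = top_p_filter(apply_temperature(logits, 1.0), 0.9)  # drops the rarest token
```

Even these small perturbations change which (sub)words appear in the tail of generated text, which is exactly the signal many detectors rely on.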