Transforming Chatbot Text: A Sequence-to-Sequence Approach

📅 2025-06-15
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
AI-generated text is increasingly subject to detection by automated classifiers, raising questions for both adversarial evasion and content governance. Method: This paper proposes a sequence-to-sequence (Seq2Seq) adversarial "humanization" rewriting framework. Rather than relying on conventional fine-tuning of the generator or prompt engineering, it applies T5-small and BART architectures to perform transferable, semantics-preserving, end-to-end rewriting that incorporates linguistics-inspired syntactic restructuring and stylistic modulation. Contribution/Results: Empirical evaluation shows that rewritten texts reduce the accuracy of mainstream GPT detectors by an average of 28.6%. Furthermore, classifiers retrained on augmented datasets containing rewritten samples identify the transformed AI-generated texts with 94.3% accuracy, indicating that the rewrites preserve semantics while remaining detectable once defenders adapt. The approach thus advances both adversarial evasion testing and detection resilience, offering a useful paradigm for governing AI-generated content.

📝 Abstract
Due to advances in Large Language Models (LLMs) such as ChatGPT, the boundary between human-written text and AI-generated text has become blurred. Nevertheless, recent work has demonstrated that it is possible to reliably detect GPT-generated text. In this paper, we adopt a novel strategy to adversarially transform GPT-generated text using sequence-to-sequence (Seq2Seq) models, with the goal of making the text more human-like. We experiment with the Seq2Seq models T5-small and BART, which serve to modify GPT-generated sentences to include linguistic, structural, and semantic components that may be more typical of human-authored text. Experiments show that classification models trained to distinguish GPT-generated text are significantly less accurate when tested on text that has been modified by these Seq2Seq models. However, after retraining classification models on data generated by our Seq2Seq technique, the models are able to distinguish the transformed GPT-generated text from human-generated text with high accuracy. This work adds to the accumulating knowledge of text transformation as a tool for both attack -- in the sense of defeating classification models -- and defense -- in the sense of improved classifiers -- thereby advancing our understanding of AI-generated text.
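The attack half of the abstract can be sketched in miniature. The paper's rewriter is a fine-tuned T5-small or BART model and its detectors are trained classifiers; in the toy sketch below, a rule-based `rewrite` function and a keyword `naive_detector` stand in for both (the function names, swap rules, and marker words are all illustrative assumptions, not the paper's actual components), so only the evaluation loop — detector accuracy before vs. after rewriting — mirrors the method.

```python
# Illustrative sketch only: the paper fine-tunes T5-small/BART as the
# rewriter; a trivial rule-based rewriter stands in here so the
# attack-evaluation loop can be shown end to end.

def rewrite(text: str) -> str:
    """Stand-in for the Seq2Seq humanizer: inject contractions and
    informal phrasing in place of markers a detector might key on."""
    swaps = {"do not": "don't", "cannot": "can't", "it is": "it's",
             "in conclusion": "all in all"}
    out = text.lower()
    for formal, casual in swaps.items():
        out = out.replace(formal, casual)
    return out

def naive_detector(text: str) -> bool:
    """Toy detector: flags text as AI-generated if it contains
    formal markers assumed typical of LLM output."""
    markers = ("do not", "cannot", "in conclusion")
    return any(m in text.lower() for m in markers)

gpt_sentences = [
    "In conclusion, we cannot ignore these results.",
    "It is clear that we do not have enough data.",
]
detected_before = sum(naive_detector(s) for s in gpt_sentences)
detected_after = sum(naive_detector(rewrite(s)) for s in gpt_sentences)
print(detected_before, detected_after)  # → 2 0
```

The toy detector catches both original sentences but neither rewrite, the same qualitative accuracy drop the paper reports (an average of 28.6% against real GPT detectors).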
Problem

Research questions and friction points this paper is trying to address.

Transform GPT-generated text to appear more human-like
Improve detection resistance against AI-text classifiers
Enhance classifier accuracy for transformed AI-generated text
Innovation

Methods, ideas, or system contributions that make the work stand out.

Seq2Seq models transform GPT-generated text
T5-small and BART enhance human-like features
Retraining classifiers improves detection accuracy
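The defense half — retraining classifiers on data augmented with rewritten samples — can be sketched the same way. The paper retrains real text classifiers; the token-frequency "detector" below is a stdlib stand-in (all names and example sentences are illustrative assumptions), showing only the key idea that a detector trained without rewritten samples misses them, while one retrained on them recovers.

```python
from collections import Counter

def train_detector(ai_texts, human_texts):
    """Toy retraining step: learn which tokens are over-represented in
    the AI class, mirroring how the paper retrains its classifiers on
    data augmented with Seq2Seq-rewritten samples."""
    ai_counts = Counter(w for t in ai_texts for w in t.lower().split())
    human_counts = Counter(w for t in human_texts for w in t.lower().split())
    markers = {w for w, c in ai_counts.items() if c > human_counts.get(w, 0)}

    def detect(text: str) -> bool:
        words = text.lower().split()
        hits = sum(w in markers for w in words)
        return hits > len(words) / 2
    return detect

human = ["yeah that movie was great honestly",
         "we grabbed lunch then headed home"]
rewritten_ai = ["all in all it's clear the results hold",
                "it's evident the data don't lie"]

# A detector trained only on unmodified AI text misses the rewrites...
old = train_detector(["in conclusion the results are significant"], human)
# ...while one retrained on the rewritten samples flags them again.
new = train_detector(rewritten_ai, human)
print([old(t) for t in rewritten_ai])  # → [False, False]
print([new(t) for t in rewritten_ai])  # → [True, True]
```

This mirrors the paper's finding that retrained classifiers recover high accuracy (94.3%) on the transformed text, i.e. the attack is itself learnable by the defender.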