🤖 AI Summary
This work identifies a critical denial-of-service (DoS) vulnerability in large language models (LLMs): recurrent generation, in which a model repeatedly produces similar or identical outputs, inducing severe latency spikes. To expose this risk, we propose RecurrentGenerator, the first black-box framework that uses evolutionary search to automatically discover inputs that trigger such cyclic generation. We further design RecurrentDetector, a lightweight real-time detector that trains a binary classifier on neural activation patterns for high-accuracy, low-overhead loop identification. Empirical evaluation on Llama-3 and GPT-4o confirms multiple classes of cyclic generation vulnerabilities, and RecurrentDetector achieves 95.24% accuracy with an F1 score of 0.87. This is the first systematic study of the DoS risks inherent in LLM recurrent generation. To foster community defense research, we open-source both our detection tools and benchmark dataset.
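The cyclic behavior described above can be quantified with a simple periodicity signal over the model's recent output tokens. The sketch below is purely illustrative (it is not the paper's fitness function or detector); `repetition_score` is a hypothetical helper that measures what fraction of a sliding window repeats with some short period:

```python
def repetition_score(tokens, max_period=8, window=32):
    """Fraction of recent tokens that repeat with a short period.

    Illustrative stand-in for a cyclic-generation signal; a looping
    model scores near 1.0, diverse output scores near 0.0.
    """
    recent = tokens[-window:]
    best = 0.0
    for p in range(1, min(max_period, len(recent)) + 1):
        matches = sum(1 for i in range(p, len(recent)) if recent[i] == recent[i - p])
        denom = len(recent) - p
        if denom > 0:
            best = max(best, matches / denom)
    return best

looping = ["the", "cat", "sat"] * 10          # perfect period-3 cycle
normal = list("abcdefghijklmnopqrstuvwxyz")   # no repetition
print(repetition_score(looping))  # close to 1.0
print(repetition_score(normal))   # 0.0
```

A search procedure such as RecurrentGenerator could, in principle, use a signal like this as its objective: mutate candidate prompts and keep those whose generations score higher, steering the population toward loop-triggering inputs.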
📝 Abstract
Large Language Models (LLMs) have significantly advanced text understanding and generation and are now integral to applications across education, software development, healthcare, entertainment, and legal services. Despite considerable progress in improving model reliability, latency remains an under-explored attack surface, particularly through recurrent generation, where models repeatedly produce similar or identical outputs, increasing latency and opening potential Denial-of-Service (DoS) vulnerabilities. We propose RecurrentGenerator, a black-box evolutionary algorithm that efficiently identifies recurrent generation scenarios in prominent LLMs such as Llama-3 and GPT-4o. Additionally, we introduce RecurrentDetector, a lightweight real-time classifier trained on activation patterns that achieves 95.24% accuracy and an F1 score of 0.87 in detecting recurrent loops. Our methods provide practical mitigations for latency-related vulnerabilities, and we publicly share our tools and data to support further research.
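The idea behind a detector like RecurrentDetector — a small binary classifier over activation features — can be sketched in miniature. Everything below is synthetic and hypothetical: the "activations" are random vectors, and the single variance feature (hidden states barely changing during a loop) stands in for whatever features the real system extracts. It is a toy logistic regression, not the paper's model:

```python
import math
import random

random.seed(0)

def variance(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

# Synthetic "activation" windows: during a loop, hidden states barely
# change step to step, so their variance across the window is low.
normal = [[random.gauss(0, 1) for _ in range(16)] for _ in range(200)]
looping = [[random.gauss(0, 0.1) for _ in range(16)] for _ in range(200)]
feats = [variance(a) for a in normal + looping]
labels = [0] * 200 + [1] * 200  # 1 = recurrent loop

# Tiny logistic regression on the variance feature, fit by gradient descent.
w, b = 0.0, 0.0
for _ in range(2000):
    gw = gb = 0.0
    for f, y in zip(feats, labels):
        p = 1 / (1 + math.exp(-(w * f + b)))
        gw += (p - y) * f
        gb += p - y
    w -= 0.5 * gw / len(feats)
    b -= 0.5 * gb / len(feats)

preds = [1 if 1 / (1 + math.exp(-(w * f + b))) > 0.5 else 0 for f in feats]
accuracy = sum(p == y for p, y in zip(preds, labels)) / len(labels)
print(f"training accuracy: {accuracy:.2f}")
```

Because the classifier only reads cheap per-step features rather than rerunning the model, this style of probe can run alongside generation with low overhead, which is the property the abstract claims for RecurrentDetector.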