CLASP: Cost-Optimized LLM-based Agentic System for Phishing Detection

📅 2025-10-21
🤖 AI Summary
Phishing website detection faces a trade-off between accuracy and computational cost. This paper proposes CLASP, a large language model (LLM)-based multi-agent detection framework that analyzes three modalities of a site: URL structure, webpage screenshot, and HTML content. Each modality is handled by a specialized agent, and the agents were built and tested with LLMs including Gemini 1.5 Flash and GPT-4o mini. The agents are combined through a strategy chosen to keep the per-website cost low without compromising detection accuracy. On a newly curated, publicly released dataset, the Gemini 1.5 Flash configuration achieves an F1 score of 83.01%, with over 40% higher recall and a 20% improvement in F1 relative to leading prior solutions. Average processing time is 2.78 seconds per website, and API cost is about $3.18 per 1,000 websites.

📝 Abstract
Phishing websites remain a significant cybersecurity threat, necessitating accurate and cost-effective detection mechanisms. In this paper, we present CLASP, a novel system that identifies phishing websites by using multiple intelligent agents, built on large language models (LLMs), to analyze different aspects of a web resource. The system takes URLs or QR codes as input and employs specialized LLM-based agents that evaluate the URL structure, webpage screenshot, and HTML content to predict potential phishing threats. To optimize performance while minimizing operational costs, we experimented with multiple strategies for combining the agents' analyses, ultimately designing a combination that keeps the per-website evaluation expense minimal without compromising detection accuracy. We tested various LLMs, including Gemini 1.5 Flash and GPT-4o mini, to build these agents and found that Gemini 1.5 Flash achieved the best performance, with an F1 score of 83.01% on a newly curated dataset. The system also maintained an average processing time of 2.78 seconds per website and an API cost of around $3.18 per 1,000 websites. Moreover, CLASP surpasses leading previous solutions, achieving over 40% higher recall and a 20% improvement in F1 score for phishing detection on the collected dataset. We have made our dataset publicly available to support further research and the development of more advanced phishing detection systems.
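
The cost and latency figures in the abstract can be turned into per-site and per-batch estimates with simple arithmetic. The snippet below is only a back-of-the-envelope sketch: it assumes cost and time scale linearly with the number of sites and that sites are processed one after another, and the daily volume of 50,000 sites is a hypothetical workload, not a figure from the paper.

```python
# Figures reported in the abstract.
COST_PER_1000_SITES_USD = 3.18
AVG_SECONDS_PER_SITE = 2.78


def cost_for(n_sites: int) -> float:
    """Approximate API cost in USD, assuming cost scales linearly with volume."""
    return n_sites * COST_PER_1000_SITES_USD / 1000


def sequential_hours(n_sites: int) -> float:
    """Approximate wall-clock hours if sites are processed strictly one at a time."""
    return n_sites * AVG_SECONDS_PER_SITE / 3600


DAILY_SITES = 50_000  # hypothetical workload, not from the paper

print(f"cost per site: ${cost_for(1):.5f}")
print(f"cost for {DAILY_SITES} sites: ${cost_for(DAILY_SITES):.2f}")
print(f"sequential time for {DAILY_SITES} sites: {sequential_hours(DAILY_SITES):.1f} h")
```

Under these assumptions a single evaluation costs roughly a third of a cent, so at large volumes the sequential processing time is likely to become a constraint before the API bill does.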

Problem

Research questions and friction points this paper is trying to address.

Detecting phishing websites accurately and cost-effectively
Optimizing agent-based analysis to minimize operational costs
Leveraging multiple LLM agents to analyze web resources

Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses multiple LLM-based agents for phishing detection
Optimizes agent combinations to minimize operational costs
Evaluates URL structure, screenshots, and HTML content
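
The summary does not say which agent combination CLASP finally adopts. As one plausible illustration of a cost-aware combination, the sketch below runs a cheap URL check first and calls the costlier HTML and screenshot agents only while the running verdict is inconclusive. Every name, threshold, score, and cost here is hypothetical, and the agents are stubbed with simple heuristics in place of real LLM calls.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple


@dataclass
class Verdict:
    phishing_score: float  # 0.0 = clearly benign, 1.0 = clearly phishing
    cost_usd: float        # hypothetical API cost of this one call


Agent = Callable[[dict], Verdict]


def url_agent(site: dict) -> Verdict:
    # Stub standing in for an LLM call on the URL string.
    hits = sum(tok in site["url"] for tok in ("login", "verify", "-secure", "@"))
    return Verdict(min(1.0, 0.1 + 0.3 * hits), cost_usd=0.0005)


def html_agent(site: dict) -> Verdict:
    # Stub standing in for an LLM call on the page's HTML.
    has_password_field = 'type="password"' in site.get("html", "")
    return Verdict(0.9 if has_password_field else 0.2, cost_usd=0.002)


def screenshot_agent(site: dict) -> Verdict:
    # Stub standing in for a multimodal LLM call on the screenshot.
    return Verdict(site.get("screenshot_score", 0.5), cost_usd=0.003)


def classify(site: dict, agents: Sequence[Agent],
             low: float = 0.2, high: float = 0.8) -> Tuple[bool, float, int]:
    """Run agents in order; stop once the mean score is conclusive."""
    if not agents:
        raise ValueError("at least one agent is required")
    scores = []
    total_cost = 0.0
    for agent in agents:
        verdict = agent(site)
        scores.append(verdict.phishing_score)
        total_cost += verdict.cost_usd
        mean = sum(scores) / len(scores)
        if mean <= low or mean >= high:
            break  # confident enough; skip the remaining, costlier agents
    return mean >= 0.5, total_cost, len(scores)


CASCADE = (url_agent, html_agent, screenshot_agent)  # cheapest first
benign = classify({"url": "https://example.org/docs"}, CASCADE)
ambiguous = classify(
    {"url": "https://shop.example.com/login", "html": '<input type="password">'},
    CASCADE,
)
print("benign:", benign)        # (is_phishing, total_cost_usd, agents_called)
print("ambiguous:", ambiguous)
```

Ordering the agents from cheapest to most expensive means that clear-cut sites are settled by the first call, and only borderline sites pay for all three modalities. A real system would replace the stubs with model calls and tune the thresholds on labeled data.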