🤖 AI Summary
Real-time detection and mitigation of social engineering attacks (such as phishing, impersonation, and vishing) pose significant challenges due to their dynamic, context-sensitive nature and stringent privacy requirements.
Method: This paper proposes the first privacy-preserving AI-in-the-loop anti-fraud dialogue framework, integrating instruction-tuned large language models (LLMs), federated learning (FedAvg), and differential privacy to enable adjustable security thresholds and dynamic real-time moderation. A multi-layer safety mechanism is implemented via guardian models (e.g., LlamaGuard).
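The federated update described above can be sketched in a few lines. This is a minimal illustration of one FedAvg round with differential-privacy-style clipping and Gaussian noise, not the paper's actual implementation; the function names and the `clip_norm`/`noise_std` parameters are assumptions for illustration.

```python
import random

def clip(update, clip_norm):
    """Scale a client update so its L2 norm is at most clip_norm."""
    norm = sum(u * u for u in update) ** 0.5
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [u * scale for u in update]

def fedavg_round(global_weights, client_updates,
                 clip_norm=1.0, noise_std=0.01, rng=None):
    """One FedAvg round: clip each client's update, average, add noise.

    Clipping bounds each client's contribution; the Gaussian noise on the
    aggregate is the standard differential-privacy mechanism. Raw client
    data never leaves the client -- only weight deltas are shared.
    """
    rng = rng or random.Random(0)
    clipped = [clip(u, clip_norm) for u in client_updates]
    n = len(clipped)
    avg = [sum(u[i] for u in clipped) / n
           for i in range(len(global_weights))]
    return [w + a + rng.gauss(0.0, noise_std)
            for w, a in zip(global_weights, avg)]
```

Running 30 such rounds corresponds to the "30 federated rounds" reported in the results; the noise scale trades privacy strength against update fidelity.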
Contribution/Results: It is the first work to jointly model real-time conversational intervention, distributed privacy protection, and adaptive security control. Experiments demonstrate fluent system responses (perplexity = 22.3), high user engagement (0.80), and strong privacy guarantees: PII leakage rate ≤ 0.0085 after 30 federated rounds. Both safety compliance and generative novelty remain stable under stringent privacy constraints.
📝 Abstract
Scams exploiting real-time social engineering -- such as phishing, impersonation, and phone fraud -- remain a persistent and evolving threat across digital platforms. Existing defenses are largely reactive, offering limited protection during active interactions. We propose a privacy-preserving, AI-in-the-loop framework that proactively detects and disrupts scam conversations in real time. The system combines instruction-tuned artificial intelligence with a safety-aware utility function that balances engagement with harm minimization, and employs federated learning to enable continual model updates without raw data sharing. Experimental evaluations show that the system produces fluent and engaging responses (perplexity as low as 22.3, engagement ≈ 0.80), while human studies confirm significant gains in realism, safety, and effectiveness over strong baselines. In federated settings, models trained with FedAvg sustain up to 30 rounds while preserving high engagement (≈ 0.80), strong relevance (≈ 0.74), and low PII leakage (≤ 0.0085). Even with differential privacy, novelty and safety remain stable, indicating that robust privacy can be achieved without sacrificing performance. The evaluation of guard models (LlamaGuard, LlamaGuard2/3, MD-Judge) shows a straightforward pattern: stricter moderation settings reduce the chance of exposing personal information, but they also limit how much the model engages in conversation. In contrast, more relaxed settings allow longer and richer interactions, which improve scam detection, but at the cost of higher privacy risk. To our knowledge, this is the first framework to unify real-time scam-baiting, federated privacy preservation, and calibrated safety moderation into a proactive defense paradigm.
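The safety-aware utility function and the strict-vs-relaxed moderation trade-off can be made concrete with a small sketch. Everything here is illustrative: the scoring functions, the harm weight `lam`, and the guard threshold `tau` are hypothetical stand-ins for the paper's adjustable security thresholds, with the guard model abstracted as a per-candidate harm score.

```python
def select_reply(candidates, engagement, harm, lam=0.5, tau=0.8):
    """Pick the reply maximizing engagement - lam * harm.

    Candidates whose guard-model harm score exceeds the moderation
    threshold tau are discarded outright. Lowering tau (stricter
    moderation) blocks more candidates, reducing privacy risk at the
    cost of engagement; raising it allows richer interactions.
    """
    safe = [c for c in candidates if harm(c) <= tau]
    if not safe:
        return None  # no compliant reply: fall back to a canned safe response
    return max(safe, key=lambda c: engagement(c) - lam * harm(c))
```

Sweeping `tau` reproduces the pattern reported for the guard models: a strict threshold forces the safest (often blandest) reply, while a relaxed one admits more engaging but riskier candidates.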