GuidaPA: Privacy-Preserving Chatbot for Public Administration via Federated Learning

📅 2026-05-31

🤖 AI Summary
This study addresses the challenge of building high-quality dialogue systems for the Italian Public Administration under stringent data privacy regulations, proposing an approach based on federated learning. The method combines parameter-efficient QLoRA (4-bit) fine-tuning with explicit monitoring of non-IID effects, running 15 rounds of federated training on documentation held locally by two national platforms, SIGESON and SIDFORS. Role-based access control and secure client-side preprocessing keep sensitive data within its original domain. The best federated model achieves ROUGE-1/2/L scores of 61.10/55.77/59.44, a BLEU-4 score of 45.02, and a METEOR score of 63.94, close to centralized fine-tuning while keeping data on-site. Relative to the general-purpose baseline, domain fine-tuning raises ROUGE-1 from 41.45 to 62.18 and BLEU-4 from 26.97 to 50.90, gains of roughly 21 and 24 points.
📝 Abstract
We present GuidaPA, a privacy-preserving chatbot for the Italian Public Administration (PA) trained via Federated Learning (FL) on documentation from two national PA platforms, SIGESON and SIDFORS. Our corpus includes approximately 8 pages of SIGESON manuals and 31 pages of SIDFORS manuals/FAQs; while this study uses public documentation as a safe proxy, the intended deployment extends to restricted internal sources (e.g., tickets, officer manuals, database extracts) that cannot be centrally pooled due to regulatory and organizational constraints. GuidaPA integrates role-based access control, secure client-side preprocessing, explicit monitoring of non-IID effects, and parameter-efficient federated fine-tuning of large language models. Using QLoRA (4-bit) over 15 federated rounds with an 80/20 train-test split per client, we evaluate answer quality with ROUGE, BLEU-4, and METEOR. The best federated model achieves ROUGE-1/2/L of 61.10/55.77/59.44, BLEU-4 of 45.02, and METEOR of 63.94, close to private centralized fine-tuning while keeping data on-site. Compared to the general-purpose baseline, domain fine-tuning improves ROUGE-1 from 41.45 to 62.18 and BLEU-4 from 26.97 to 50.90. Overall, the results indicate that FL can deliver high-quality conversational AI for public services without centralized data sharing.
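The abstract does not name the aggregation rule used across the 15 federated rounds. As a rough illustration only, the sketch below assumes a FedAvg-style weighted average in which each client sends back only its small adapter weights and the raw documents never leave the client. All names (`fedavg`, `local_update`, `run_rounds`) are hypothetical, and `local_update` is a placeholder for local QLoRA training rather than a real training step.

```python
from typing import Dict, List

# Assumption: an adapter is modeled as parameter name -> flattened weights.
Adapter = Dict[str, List[float]]


def fedavg(client_adapters: List[Adapter], num_examples: List[int]) -> Adapter:
    """Average client adapters, weighting each client by its example count."""
    if not client_adapters or len(client_adapters) != len(num_examples):
        raise ValueError("need one example count per client adapter")
    total = sum(num_examples)
    if total <= 0:
        raise ValueError("total number of examples must be positive")
    averaged: Adapter = {}
    for name, reference in client_adapters[0].items():
        averaged[name] = [
            sum(n * adapter[name][i]
                for adapter, n in zip(client_adapters, num_examples)) / total
            for i in range(len(reference))
        ]
    return averaged


def local_update(global_adapter: Adapter, shift: float) -> Adapter:
    """Placeholder for local fine-tuning: nudges every weight by `shift`."""
    return {name: [w + shift for w in ws] for name, ws in global_adapter.items()}


def run_rounds(initial: Adapter, client_shifts: List[float],
               num_examples: List[int], rounds: int = 15) -> Adapter:
    """Repeat: broadcast the global adapter, update locally, aggregate."""
    global_adapter = initial
    for _ in range(rounds):
        updates = [local_update(global_adapter, s) for s in client_shifts]
        global_adapter = fedavg(updates, num_examples)
    return global_adapter
```

With two clients of very different size, as with SIGESON and SIDFORS here, example-count weighting lets the larger client dominate the average. That is one reason explicit non-IID monitoring matters in such a setup.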
Problem

Research questions and friction points this paper is trying to address.

privacy-preserving
public administration
federated learning
chatbot
data decentralization
Innovation

Methods, ideas, or system contributions that make the work stand out.

Federated Learning
Privacy-Preserving AI
QLoRA
Public Administration Chatbot
Non-IID Monitoring