GuidaPA: Privacy-Preserving Chatbot for Public Administration via Federated Learning

📅 2026-05-31

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study addresses the challenge of building high-quality dialogue systems for the Italian public administration under stringent data privacy regulations by proposing a federated learning-based, privacy-preserving approach. The method combines parameter-efficient QLoRA (4-bit) fine-tuning with monitoring of non-IID effects, running 15 rounds of federated training on documentation from two national platforms, SIGESON and SIDFORS. Role-based access control and secure client-side preprocessing are designed to keep sensitive data within its original domain. The best federated model achieves ROUGE-1/2/L scores of 61.10/55.77/59.44 and a BLEU-4 score of 45.02, close to centralized fine-tuning while keeping data on-site. Relative to the general-purpose baseline, domain fine-tuning raises ROUGE-1 from 41.45 to 62.18 and BLEU-4 from 26.97 to 50.90, gains of roughly 21 and 24 points, respectively.
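The federated rounds described here can be pictured with a minimal sketch of server-side aggregation. It assumes a FedAvg-style weighted average over the clients' adapter weights, which the source does not confirm as the aggregation rule; the `Adapter` type, the `fedavg` function, the tensor names, and all numeric values are hypothetical and chosen only for illustration.

```python
from typing import Dict, List

# Hypothetical representation: each adapter tensor flattened to a list of floats.
Adapter = Dict[str, List[float]]


def fedavg(adapters: List[Adapter], num_examples: List[int]) -> Adapter:
    """Average client adapters, weighting each client by its example count.

    Assumption: plain FedAvg. The paper's actual aggregation rule is not
    stated in the summary.
    """
    if not adapters or len(adapters) != len(num_examples):
        raise ValueError("need one example count per client adapter")
    total = sum(num_examples)
    if total <= 0:
        raise ValueError("total number of examples must be positive")

    averaged: Adapter = {}
    for name, first in adapters[0].items():
        averaged[name] = [
            sum(a[name][i] * n for a, n in zip(adapters, num_examples)) / total
            for i in range(len(first))
        ]
    return averaged


# Two clients named after the platforms; weights and counts are made up.
sigeson = {"lora_A": [0.0, 1.0], "lora_B": [2.0]}
sidfors = {"lora_A": [1.0, 1.0], "lora_B": [4.0]}

global_adapter = fedavg([sigeson, sidfors], num_examples=[1, 3])
print(global_adapter)
```

Only adapter weights cross the client boundary in this sketch; the documents themselves never leave the client. Weighting by example count means the client with more data pulls the global adapter further toward its own update, which is one reason non-IID effects are worth monitoring.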

📝 Abstract

We present GuidaPA, a privacy-preserving chatbot for the Italian Public Administration (PA) trained via Federated Learning (FL) on documentation from two national PA platforms, SIGESON and SIDFORS. Our corpus includes approximately 8 pages of SIGESON manuals and 31 pages of SIDFORS manuals/FAQs; while this study uses public documentation as a safe proxy, the intended deployment extends to restricted internal sources (e.g., tickets, officer manuals, database extracts) that cannot be centrally pooled due to regulatory and organizational constraints. GuidaPA integrates role-based access control, secure client-side preprocessing, explicit monitoring of non-IID effects, and parameter-efficient federated fine-tuning of large language models. Using QLoRA (4-bit) over 15 federated rounds with an 80/20 train-test split per client, we evaluate answer quality with ROUGE, BLEU-4, and METEOR. The best federated model achieves ROUGE-1/2/L of 61.10/55.77/59.44, BLEU-4 of 45.02, and METEOR of 63.94, close to private centralized fine-tuning while keeping data on-site. Compared to the general-purpose baseline, domain fine-tuning improves ROUGE-1 from 41.45 to 62.18 and BLEU-4 from 26.97 to 50.90. Overall, the results indicate that FL can deliver high-quality conversational AI for public services without centralized data sharing.
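The point gains implied by these scores can be checked with a short calculation. The figures below are copied from the abstract; the dictionary names and the `gains` helper are hypothetical, and the sketch covers only the two metrics for which the abstract gives a baseline.

```python
# Scores as reported in the abstract (ROUGE-1 and BLEU-4 only).
baseline = {"ROUGE-1": 41.45, "BLEU-4": 26.97}         # general-purpose model
domain_fine_tuned = {"ROUGE-1": 62.18, "BLEU-4": 50.90}
best_federated = {"ROUGE-1": 61.10, "BLEU-4": 45.02}


def gains(model: dict, reference: dict) -> dict:
    """Absolute point difference per metric, rounded to two decimals."""
    return {metric: round(model[metric] - reference[metric], 2) for metric in reference}


fine_tuning_gain = gains(domain_fine_tuned, baseline)
federated_gain = gains(best_federated, baseline)
federated_gap = gains(domain_fine_tuned, best_federated)

print(fine_tuning_gain)  # {'ROUGE-1': 20.73, 'BLEU-4': 23.93}
print(federated_gain)    # {'ROUGE-1': 19.65, 'BLEU-4': 18.05}
print(federated_gap)     # {'ROUGE-1': 1.08, 'BLEU-4': 5.88}
```

On these numbers the best federated model trails the domain fine-tuned scores by about 1 point on ROUGE-1 and about 6 points on BLEU-4.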

Problem

Research questions and friction points this paper is trying to address.

privacy-preserving

public administration

federated learning

chatbot

data decentralization

Innovation

Methods, ideas, or system contributions that make the work stand out.

Federated Learning

Privacy-Preserving AI

QLoRA