VirtualXAI: A User-Centric Framework for Explainability Assessment Leveraging GPT-Generated Personas

📅 2025-03-06
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
Quantitative XAI evaluation metrics (e.g., fidelity, stability) are misaligned with user-centered qualitative requirements (e.g., comprehensibility, satisfaction), and there is no data-driven guidance for jointly selecting appropriate AI models and XAI methods. Method: We propose the first hybrid evaluation framework integrating quantitative benchmarks with LLM-generated virtual user personas. It introduces multi-dimensional GPT-based personas for subjective interpretability assessment; designs a content-aware dataset–model–XAI triadic matching and effectiveness estimation mechanism; and unifies collaborative filtering recommendation, XAI metric computation, and semantic satisfaction analysis. Contribution/Results: Evaluated across multiple benchmarks, our framework improves recommendation accuracy by 32%, achieves strong agreement between virtual persona assessments and real-user studies (Spearman ρ = 0.89), and enables end-to-end automated selection of XAI solutions alongside pre-deployment explanation quality prediction.
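The reported persona-to-user agreement is a rank correlation. As a minimal sketch of how such a figure is computed (the ratings below are hypothetical placeholders, not the paper's study data), Spearman's ρ between persona scores and real-user scores can be derived from the two rank vectors:

```python
# Hedged sketch: Spearman rank correlation between GPT-persona scores
# and real-user scores. All ratings are illustrative, not the paper's data.

def rank(values):
    """Return 1-based average ranks, handling ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average rank for the tied run i..j
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_rho(xs, ys):
    """Pearson correlation applied to the rank vectors of xs and ys."""
    rx, ry = rank(xs), rank(ys)
    n = len(xs)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

persona_scores = [4.1, 3.2, 4.8, 2.5, 3.9]  # hypothetical persona ratings
user_scores    = [4.0, 3.6, 4.5, 2.8, 3.0]  # hypothetical real-user ratings
print(round(spearman_rho(persona_scores, user_scores), 2))  # -> 0.9
```

In practice a library routine such as `scipy.stats.spearmanr` would be used; the hand-rolled version above just makes the ranking step explicit.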

๐Ÿ“ Abstract
In today's data-driven era, computational systems generate vast amounts of data that drive the digital transformation of industries, where Artificial Intelligence (AI) plays a key role. Demand for eXplainable AI (XAI) has grown as a means to enhance the interpretability, transparency, and trustworthiness of AI models. However, evaluating XAI methods remains challenging: existing evaluation frameworks typically focus on quantitative properties such as fidelity, consistency, and stability without taking into account qualitative characteristics such as satisfaction and interpretability. In addition, practitioners face a lack of guidance in selecting appropriate datasets, AI models, and XAI methods, a major hurdle in human-AI collaboration. To address these gaps, we propose a framework that integrates quantitative benchmarking with qualitative user assessments through virtual personas based on the "Anthology" of Large Language Model (LLM) backstories. Our framework also incorporates a content-based recommender system that leverages dataset-specific characteristics to match new input data against a repository of benchmarked datasets. This yields an estimated XAI score and provides tailored recommendations for both the optimal AI model and the XAI method for a given scenario.
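The content-based recommender described in the abstract can be sketched as nearest-neighbor retrieval over dataset profiles: a new dataset's feature vector is compared against benchmarked entries, and the closest match supplies the estimated XAI score and a model/XAI-method pairing. Everything below (feature vectors, repository entries, names, scores) is an illustrative assumption, not data or code from the paper:

```python
# Hedged sketch of content-based dataset-to-repository matching.
# Profiles, model names, and scores are hypothetical placeholders.
from math import sqrt

# Repository: dataset name -> (characteristics vector,
#                              (best AI model, best XAI method, XAI score))
REPOSITORY = {
    "tabular-credit": ([0.9, 0.1, 0.3], ("XGBoost", "SHAP", 0.82)),
    "image-defects":  ([0.1, 0.9, 0.6], ("CNN", "Grad-CAM", 0.74)),
    "sensor-streams": ([0.4, 0.2, 0.9], ("LSTM", "LIME", 0.68)),
}

def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def recommend(profile):
    """Return (matched dataset, (model, xai_method, estimated_score))."""
    best = max(REPOSITORY.items(), key=lambda kv: cosine(profile, kv[1][0]))
    return best[0], best[1][1]

name, (model, xai, score) = recommend([0.8, 0.2, 0.4])
print(name, model, xai, score)  # closest profile drives the recommendation
```

The paper's mechanism is richer (triadic dataset-model-XAI matching plus collaborative filtering); this sketch only illustrates the retrieval-by-similarity core.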
Problem

Research questions and friction points this paper is trying to address.

Enhance interpretability and trustworthiness of AI models.
Address lack of guidance in selecting datasets and XAI methods.
Integrate quantitative and qualitative assessments using virtual personas.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Integrates quantitative benchmarking with qualitative user assessments
Uses GPT-generated personas for explainability evaluation
Recommends optimal AI models and XAI methods
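The persona-based evaluation above could, as a rough sketch, be driven by prompt templates that condition an LLM on a backstory before it rates an explanation. The template, persona fields, and rating rubric below are illustrative assumptions, not the paper's actual "Anthology" prompts:

```python
# Hedged sketch: assembling a virtual-persona prompt for subjective
# interpretability assessment. Fields and rubric are hypothetical.

PERSONA_TEMPLATE = (
    "You are {name}, a {age}-year-old {occupation}. {backstory}\n"
    "Below is an AI model's explanation of its prediction:\n"
    "{explanation}\n"
    "Rate how understandable and satisfying this explanation is to you "
    "on a 1-5 scale, then justify the rating in one sentence."
)

def build_persona_prompt(persona, explanation):
    """Fill the template with one persona's fields and the explanation text."""
    return PERSONA_TEMPLATE.format(**persona, explanation=explanation)

persona = {  # hypothetical persona backstory
    "name": "Maria",
    "age": 52,
    "occupation": "loan officer",
    "backstory": "You use scoring tools daily but have no ML training.",
}
prompt = build_persona_prompt(
    persona, "Income and loan term were the main factors in the denial."
)
print(prompt.splitlines()[0])
```

Sending such prompts to a GPT model and parsing the 1-5 ratings would yield the subjective scores that the framework aggregates alongside its quantitative metrics.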
Authors

Georgios Makridis, University of Piraeus (neural networks, data science, machine learning, information theory, anomaly detection)
Vasileios Koukos, University of Piraeus (Data Management, Big Data Analytics)
G. Fatouros, Department of Digital Systems, University of Piraeus, Piraeus, Greece
D. Kyriazis, Department of Digital Systems, University of Piraeus, Piraeus, Greece