🤖 AI Summary
This study investigates whether large language models (LLMs) can effectively translate the internal mechanisms of mathematically interpretable recommendation models, such as constrained matrix factorization, into natural-language explanations comprehensible to end users.
Method: We propose a prompt-driven framework that encodes structured information (including user type, rating logic, and model constraints) into LLM inputs to generate diverse, semantically coherent explanations. Adopting a user-centered design, we conduct an empirical study with 326 participants, evaluating explanations across transparency, effectiveness, persuasion, trust, and satisfaction using multi-dimensional human assessments to overcome the limitations of automated metrics.
Contribution/Results: We provide the first systematic validation of LLMs' efficacy in bridging *mathematical interpretability* and *user-understandable explanations*. We identify significant perceptual differences across explanation strategies and demonstrate that our approach substantially improves users' comprehension of and trust in recommendations, establishing a new paradigm for explainable recommendation that balances theoretical rigor with practical deployability.
📄 Abstract
We investigate whether large language models (LLMs) can generate effective, user-facing explanations from a mathematically interpretable recommendation model. The model is based on constrained matrix factorization, where user types are explicitly represented and predicted item scores share the same scale as observed ratings, making the model's internal representations and predicted scores directly interpretable. This structure is translated into natural-language explanations using carefully designed LLM prompts. Many works in explainable AI rely on automatic evaluation metrics, which often fail to capture users' actual needs and perceptions. In contrast, we adopt a user-centered approach: we conduct a study with 326 participants who assessed the quality of the explanations across five key dimensions (transparency, effectiveness, persuasion, trust, and satisfaction) as well as the recommendations themselves. To evaluate how different explanation strategies are perceived, we generate multiple explanation types from the same underlying model, varying the input information provided to the LLM. Our analysis reveals that all explanation types are generally well received, with moderate statistical differences between strategies. User comments further underscore how participants react to each type of explanation, offering complementary insights beyond the quantitative results.
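To make the "directly interpretable" property concrete, the following is a minimal sketch of a constrained matrix-factorization predictor of the kind the abstract describes. All names and the exact constraint scheme here are illustrative assumptions, not the paper's implementation: each user is modeled as a convex mixture over a small set of interpretable user types, and each type's item-preference profile is expressed directly on the rating scale, so predicted scores stay on that same scale.

```python
import numpy as np

# Illustrative sketch (not the paper's code): user-type mixtures over
# rating-scale preference profiles keep predictions on the observed scale.

rng = np.random.default_rng(0)

n_types, n_items = 3, 5
R_MIN, R_MAX = 1.0, 5.0  # observed rating scale, e.g. 1-5 stars

# Type-to-item preference profiles, constrained to the rating scale.
type_profiles = rng.uniform(R_MIN, R_MAX, size=(n_types, n_items))

def user_type_weights(raw):
    """Softmax projection: type memberships are non-negative and sum to 1,
    so each user is a convex combination of the interpretable types."""
    e = np.exp(raw - raw.max())
    return e / e.sum()

def predict(raw_user_params):
    w = user_type_weights(raw_user_params)  # interpretable memberships
    # Convex combination of rows in [R_MIN, R_MAX] stays in [R_MIN, R_MAX].
    return w @ type_profiles

scores = predict(rng.normal(size=n_types))
assert R_MIN <= scores.min() and scores.max() <= R_MAX
```

Because both the type memberships and the type profiles live in human-readable units, the structured inputs to the LLM prompt (user type, per-type preferences, predicted score) can be read off directly from `user_type_weights(...)` and `type_profiles` rather than from opaque latent factors.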