Understanding the Challenges and Promises of Developing Generative AI Apps: An Empirical Study

📅 2025-06-19
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study investigates how real users perceive generative AI applications (Gen-AI apps), focusing on the challenges and expectations they report around functional dimensions such as AI performance, content quality, and content policy and censorship. Method: Using 676,066 user reviews from 173 Gen-AI apps on Google Play, we propose the SARA framework (Selection, Acquisition, Refinement, Analysis) and apply an LLM-driven five-shot topic extraction method that reaches 91% accuracy on unstructured review text. Temporal analysis then shows how user concerns and engagement patterns shift over time. Contribution/Results: We identify the ten most frequently discussed topics and trace how they evolve. Based on these findings, we derive 12 actionable, evidence-based recommendations spanning functional optimization, ethical governance, and human-AI interaction design, giving practitioners and researchers a data-driven roadmap for responsible Gen-AI development.

📝 Abstract
The release of ChatGPT in 2022 triggered a rapid surge in generative artificial intelligence mobile apps (i.e., Gen-AI apps). Despite widespread adoption, little is known about how end users perceive and evaluate these Gen-AI functionalities in practice. In this work, we conduct a user-centered analysis of 676,066 reviews from 173 Gen-AI apps on the Google Play Store. We introduce a four-phase methodology, SARA (Selection, Acquisition, Refinement, and Analysis), that enables the systematic extraction of user insights using prompt-based LLM techniques. First, we demonstrate the reliability of LLMs in topic extraction, achieving 91% accuracy through five-shot prompting and non-informative review filtering. Then, we apply this method to the informative reviews, identify the top 10 user-discussed topics (e.g., AI Performance, Content Quality, and Content Policy & Censorship), and analyze the key challenges and emerging opportunities. Finally, we examine how these topics evolve over time, offering insight into shifting user expectations and engagement patterns with Gen-AI apps. Based on our findings and observations, we present actionable implications for developers and researchers.
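The five-shot prompting and non-informative review filtering mentioned in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' pipeline: the example reviews, the word-count filter, the prompt wording, and the names `FEW_SHOT_EXAMPLES`, `is_informative`, and `build_five_shot_prompt` are all hypothetical. Only the three topic names come from the abstract. The call to an actual LLM is left out on purpose.

```python
from typing import List, Tuple

# Hypothetical labeled examples, invented for illustration only.
# The topic names are the three mentioned in the abstract.
FEW_SHOT_EXAMPLES: List[Tuple[str, str]] = [
    ("The bot takes forever to answer and often times out.", "AI Performance"),
    ("Generated images look blurry and distorted.", "Content Quality"),
    ("It refuses harmless prompts because of the filter.", "Content Policy & Censorship"),
    ("Answers are accurate and fast, very impressed.", "AI Performance"),
    ("The essays it writes are repetitive and shallow.", "Content Quality"),
]


def is_informative(review: str, min_words: int = 4) -> bool:
    """Crude stand-in for non-informative review filtering (an assumption):
    keep only reviews that contain at least `min_words` words."""
    return len(review.split()) >= min_words


def build_five_shot_prompt(
    review: str,
    examples: List[Tuple[str, str]] = FEW_SHOT_EXAMPLES,
) -> str:
    """Build a prompt with exactly five labeled examples followed by the
    unlabeled target review, leaving the final topic for the model to fill in."""
    if len(examples) != 5:
        raise ValueError("five-shot prompting expects exactly five examples")
    lines = ["Assign one topic to each app review.", ""]
    for text, topic in examples:
        lines.append(f"Review: {text}")
        lines.append(f"Topic: {topic}")
        lines.append("")
    lines.append(f"Review: {review}")
    lines.append("Topic:")
    return "\n".join(lines)
```

In this sketch, a review would first pass `is_informative`, and only then be wrapped by `build_five_shot_prompt` and sent to whichever model is in use.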
Problem

Research questions and friction points this paper is trying to address.

Analyzing user perceptions of Gen-AI app functionalities
Identifying key challenges in Gen-AI app performance and content
Tracking evolution of user expectations in Gen-AI apps
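One simple way to look at how topics evolve, assuming each review already carries a date and an extracted topic, is to compute each topic's share of reviews per month. The sketch below is a hypothetical illustration rather than the paper's temporal analysis. The function name `monthly_topic_share` and the input format are assumptions.

```python
from collections import Counter, defaultdict
from datetime import date
from typing import Dict, Iterable, Tuple


def monthly_topic_share(
    labeled_reviews: Iterable[Tuple[date, str]],
) -> Dict[str, Dict[str, float]]:
    """Map each month ("YYYY-MM") to the fraction of that month's reviews
    assigned to each topic. Input is (review_date, topic) pairs."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for day, topic in labeled_reviews:
        counts[day.strftime("%Y-%m")][topic] += 1

    shares: Dict[str, Dict[str, float]] = {}
    for month in sorted(counts):  # chronological, since "YYYY-MM" sorts lexically
        total = sum(counts[month].values())
        shares[month] = {t: n / total for t, n in counts[month].items()}
    return shares
```

Comparing a topic's share across consecutive months gives a rough view of whether it is gaining or losing attention, independent of how many reviews were posted overall.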
Innovation

Methods, ideas, or system contributions that make the work stand out.

User-centered analysis of 676K app reviews
Four-phase SARA methodology for insights
LLM topic extraction with 91% accuracy
Buthayna AlMulla
University of Toronto, Canada
Maram Assi
Université du Québec à Montréal, Canada
Safwat Hassan
Assistant professor at University of Toronto
Software Analytics, Mining Software Repositories, Empirical Software Engineering, Android, DevOps