A Dataset of Low-Rated Applications from the Amazon Appstore for User Feedback Analysis

📅 2026-01-06
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study addresses the longstanding lack of systematic analysis of user feedback in low-rated mobile applications, which has hindered effective software quality improvement. To bridge this gap, the authors present and publicly release the first fine-grained annotated dataset specifically targeting low-rated apps, comprising 79,821 user reviews from 64 applications on the Amazon Appstore, with 6,000 reviews manually labeled into six distinct issue categories. This resource enables research on automated feedback classification and is particularly valuable for detecting complex phenomena such as missing features, sarcastic expressions, and nuanced emotional sentiments. By providing a structured and representative corpus, the dataset fills a critical void in the literature and serves as a foundational asset for machine learning–driven approaches to user feedback analysis and software evolution.

📝 Abstract
In today's digital landscape, end-user feedback plays a crucial role in the evolution of software applications, particularly in addressing issues that hinder user experience. While much research has focused on high-rated applications, low-rated applications often remain unexplored, despite their potential to reveal valuable insights. This study introduces a novel dataset curated from 64 low-rated applications sourced from the Amazon Software Appstore (ASA), containing 79,821 user reviews. The dataset is designed to capture the most frequent issues identified by users, which are critical for improving software quality. To further enhance the dataset's utility, a subset of 6,000 reviews was manually annotated to classify them into six distinct issue categories: user interface (UI) and user experience (UX), functionality and features, compatibility and device specificity, performance and stability, customer support and responsiveness, and security and privacy issues. This annotated dataset is a valuable resource for developing machine learning-based approaches that aim to automate the classification of user feedback into various issue types. Making both the annotated and raw datasets publicly available provides researchers and developers with a crucial tool to understand common issues in low-rated apps and inform software improvements. The comprehensive analysis and availability of this dataset lay the groundwork for data-driven solutions to improve software quality based on user feedback. Additionally, the dataset offers software vendors and researchers opportunities to explore various software evolution-related activities, including frequently missing features, sarcasm, and associated emotions, which will help better understand the reasons for comparatively low app ratings.
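The annotated subset is intended as training data for automated issue classification. A minimal sketch of such a classifier is shown below, using TF-IDF features and logistic regression over the paper's six issue categories. The category identifiers, the column-free toy reviews, and the model choice are all illustrative assumptions, not the authors' released files or pipeline.

```python
# Sketch: classifying app reviews into six issue categories with
# TF-IDF + logistic regression (scikit-learn). The toy reviews below
# are hypothetical stand-ins for the paper's 6,000 annotated reviews.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

CATEGORIES = [
    "ui_ux", "functionality_features", "compatibility_device",
    "performance_stability", "customer_support", "security_privacy",
]

# Toy labeled examples, two per category (assumed format: text, label).
reviews = [
    ("The buttons are tiny and the layout is confusing", "ui_ux"),
    ("Menus overlap and the dark theme is unreadable", "ui_ux"),
    ("The export feature is missing entirely", "functionality_features"),
    ("Search never returns any results", "functionality_features"),
    ("Crashes on my Fire tablet but works on my phone", "compatibility_device"),
    ("Not supported on older Android versions", "compatibility_device"),
    ("App freezes and lags constantly", "performance_stability"),
    ("Takes a minute to load every screen", "performance_stability"),
    ("Support never replied to my refund request", "customer_support"),
    ("Emailed the developer twice with no response", "customer_support"),
    ("It asks for contact permissions it does not need", "security_privacy"),
    ("My account data was shared without consent", "security_privacy"),
]
texts, labels = zip(*reviews)

# Pipeline: unigram+bigram TF-IDF features into a multinomial classifier.
clf = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),
    LogisticRegression(max_iter=1000),
)
clf.fit(texts, labels)

# Predict the issue category of an unseen review.
prediction = clf.predict(["the app freezes and lags on startup"])[0]
print(prediction)
```

On the real 6,000-review subset one would split off a held-out test set and report per-category precision/recall, since issue categories in review corpora are typically imbalanced.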
Problem

Research questions and friction points this paper is trying to address.

low-rated applications
user feedback analysis
software quality
issue classification
Amazon Appstore
Innovation

Methods, ideas, or system contributions that make the work stand out.

low-rated applications
user feedback analysis
annotated dataset
issue categorization
software quality improvement
Nek Dil Khan
Beijing University of Technology
NLP, Data Mining, Requirement Engineering, DL & AI
Javed Ali Khan
University of Hertfordshire, UK
Software Engineering, CrowdRE, Repositories Mining, AI4SE, Health Analytics
Darvesh Khan
Department of Software Engineering, CECOS University of IT & Emerging Sciences, Peshawar, Khyber Pakhtunkhwa, Pakistan
Jianqiang Li
Beijing University of Technology
Mumrez Khan
Faculty of Computer Science and Engineering, Xi'an University of Technology, Xi'an, 710121, China
Shah Fahad Khan
Computer Software Engineering College, Dalian University of Technology, Dalian 116000, China