🤖 AI Summary
To address weak topic coherence, poor interpretability, and low star-rating prediction accuracy in customer feedback analysis, this paper restructures the analysis around opinion units. Reviews are decomposed into minimal semantic units, each pairing a text snippet with its sentiment score, and these units are systematically adopted as the fundamental input for topic modeling, a use first introduced in this work. Methodologically, the paper integrates large language model–driven opinion unit extraction, interpretable topic modeling, fine-grained sentiment analysis, and multimodal feature fusion into an end-to-end analytical framework. Experiments report improvements of +12.3% NPMI in topic coherence and +5.8% accuracy in star-rating prediction, along with better cross-domain business insight generation. The core contribution is establishing the opinion unit as an analytical primitive that unifies topic structure, sentiment polarity, and business metrics.
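For readers unfamiliar with the coherence metric quoted above: NPMI (normalized pointwise mutual information) is commonly defined for a word pair as shown below. This is the standard textbook definition, not a statement of the authors' exact setup; the number of top words per topic, the reference corpus, and any smoothing are not given in this summary.

```latex
\mathrm{NPMI}(w_i, w_j) = \frac{\log \dfrac{P(w_i, w_j)}{P(w_i)\,P(w_j)}}{-\log P(w_i, w_j)}
```

The score lies in $[-1, 1]$, and a topic's coherence is usually taken as the average NPMI over pairs of its top-ranked words.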
📝 Abstract
We improve the extraction of insights from customer reviews by restructuring the topic modeling pipeline to operate on opinion units: distinct statements that include relevant text excerpts and associated sentiment scores. Prior work has demonstrated that such units can be reliably extracted using large language models. Operating on these units improves the subsequent topic modeling, yielding coherent and interpretable topics while also capturing the sentiment associated with each topic. By correlating the topics and sentiments with business metrics, such as star ratings, we can gain insight into how specific customer concerns impact business outcomes. We present our system's implementation, use cases, and advantages over other topic modeling and classification solutions. We also evaluate its effectiveness in creating coherent topics and assess methods for integrating topic and sentiment modalities for accurate star-rating prediction.
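To make the idea of correlating per-topic sentiment with star ratings concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the authors' implementation: the `OpinionUnit` fields, the sentiment scale of -1.0 to 1.0, the hand-assigned topic labels, the toy reviews, and the use of Pearson correlation are all hypothetical choices. In the described system, the units would come from an LLM extraction step and the topics from a topic model.

```python
from collections import defaultdict
from dataclasses import dataclass
from math import sqrt


@dataclass(frozen=True)
class OpinionUnit:
    """One opinion unit: a text excerpt plus its sentiment (hypothetical schema)."""
    review_id: str
    excerpt: str      # text span taken from the review
    sentiment: float  # assumed scale: -1.0 (negative) to 1.0 (positive)
    topic: str        # topic label; hand-assigned here, a topic model in practice


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0


def topic_rating_correlation(units, stars):
    """For each topic, correlate a review's mean sentiment on it with the review's stars."""
    per_topic = defaultdict(lambda: defaultdict(list))
    for u in units:
        per_topic[u.topic][u.review_id].append(u.sentiment)
    result = {}
    for topic, by_review in per_topic.items():
        xs = [sum(v) / len(v) for v in by_review.values()]
        ys = [stars[rid] for rid in by_review]
        result[topic] = pearson(xs, ys)
    return result


# Toy data, invented for illustration only.
units = [
    OpinionUnit("r1", "delivery was late", -0.8, "delivery"),
    OpinionUnit("r1", "great fabric", 0.9, "quality"),
    OpinionUnit("r2", "arrived on time", 0.7, "delivery"),
    OpinionUnit("r2", "stitching came loose", -0.6, "quality"),
    OpinionUnit("r3", "fast shipping", 0.9, "delivery"),
    OpinionUnit("r3", "feels sturdy", 0.8, "quality"),
]
stars = {"r1": 2, "r2": 4, "r3": 5}

corr = topic_rating_correlation(units, stars)
for topic, r in sorted(corr.items()):
    print(f"{topic}: r = {r:.2f}")
```

In this toy data, delivery sentiment tracks the star rating closely while quality sentiment does not, which is the kind of signal the abstract has in mind when it speaks of linking specific customer concerns to business outcomes. A real analysis would need far more reviews and a check for statistical significance.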