🤖 AI Summary
This work addresses the challenge of extracting improvement suggestions from unstructured customer reviews, where mixed intents often hinder accurate identification. To tackle this, the authors propose a hybrid reasoning framework that integrates supervised learning with large language models (LLMs). The approach combines a high-recall RoBERTa classifier, an instruction-tuned LLM, rule-based filtering, and clustering algorithms to extract, classify, cluster, and summarize suggestions. Experimental results on real-world hotel and restaurant review datasets show that the system outperforms prompt-only, rule-based, and single-classifier baselines in both extraction accuracy and cluster coherence. Human evaluations further confirm that the generated outputs are clear, faithful, and interpretable, and the overall design balances recall and precision.
📝 Abstract
Extracting actionable suggestions from customer reviews is essential for operational decision-making, yet these directives are often embedded within mixed-intent, unstructured text. Existing approaches either classify suggestion-bearing sentences or generate high-level summaries, but rarely isolate the precise improvement instructions businesses need. We evaluate a hybrid pipeline that pairs a high-recall RoBERTa classifier, trained with a precision-recall surrogate to reduce unrecoverable false negatives, with a controlled, instruction-tuned LLM for suggestion extraction, categorization, clustering, and summarization. Across real-world hospitality and food datasets, the hybrid system outperforms prompt-only, rule-based, and classifier-only baselines in extraction accuracy and cluster coherence. Human evaluations further confirm that the resulting suggestions and summaries are clear, faithful, and interpretable. Overall, our results show that hybrid reasoning architectures achieve meaningful improvements in fine-grained actionable suggestion mining while highlighting challenges in domain adaptation and efficient local deployment.
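The two-stage idea described above (a recall-oriented filter followed by a more selective extractor) can be illustrated with a minimal sketch. This is not the authors' implementation: it substitutes plain threshold tuning on validation scores for the precision-recall surrogate training mentioned in the abstract, and the names `pick_high_recall_threshold`, `run_pipeline`, `score_fn`, and `extract_fn` are hypothetical stand-ins for the classifier and the LLM call.

```python
from typing import Callable, List, Sequence


def pick_high_recall_threshold(
    scores: Sequence[float], labels: Sequence[int], target_recall: float = 0.95
) -> float:
    """Return the largest threshold whose recall on the validation set meets target_recall.

    Assumption: simple threshold tuning, used here in place of the paper's
    precision-recall surrogate objective, which the abstract does not specify.
    """
    positives = sum(labels)
    if positives == 0:
        raise ValueError("need at least one positive example")
    # Walk candidate thresholds from strictest to most permissive.
    for threshold in sorted(set(scores), reverse=True):
        true_pos = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
        if true_pos / positives >= target_recall:
            return threshold
    return min(scores)  # fallback; the lowest score always yields recall 1.0


def run_pipeline(
    sentences: Sequence[str],
    score_fn: Callable[[str], float],  # hypothetical stand-in for the classifier
    extract_fn: Callable[[str], str],  # hypothetical stand-in for the LLM extractor
    threshold: float,
) -> List[str]:
    """Stage 1 keeps sentences scoring at or above threshold; stage 2 extracts from them."""
    candidates = [s for s in sentences if score_fn(s) >= threshold]
    extracted = [extract_fn(s) for s in candidates]
    return [e for e in extracted if e]  # an empty string means stage 2 rejected the sentence
```

The design point is that a sentence dropped in stage 1 can never be recovered later, whereas a false positive passed through can still be rejected by the extractor, which is why the first stage favors recall.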