🤖 AI Summary
Existing approaches to extracting customer needs (CNs) from unstructured text (e.g., interview transcripts, online reviews) rely heavily on costly manual interpretation by professional analysts. Method: We fine-tune a large language model (LLM) on professionally identified CNs via supervised fine-tuning (SFT) and, in partnership with a marketing consulting firm, compare it in a blind study against a prompt-engineered base LLM and professional analysts. Contribution/Results: The SFT LLM matches or exceeds professional analysts on formulation quality, specificity, traceability to source content (no hallucinations), and completeness of coverage, whereas the base LLM is not sufficiently accurate or specific. The approach reduces manual effort and improves the precision of CN articulation, giving better insight for innovation and marketing strategy.
📝 Abstract
Identifying customer needs (CNs) is important for product management, product development, and marketing. Applications rely on professional analysts interpreting textual data (e.g., interview transcripts, online reviews) to understand the nuances of customer experience and concisely formulate "jobs to be done." The task is cognitively complex and time-consuming. Current practice facilitates the process with keyword search and machine learning but relies on human judgment to formulate CNs. We examine whether Large Language Models (LLMs) can automatically extract CNs. Because evaluating CNs requires professional judgment, we partnered with a marketing consulting firm to conduct a blind study of CNs extracted by: (1) a foundational LLM with prompt engineering only (Base LLM), (2) an LLM fine-tuned with professionally identified CNs (SFT LLM), and (3) professional analysts. The SFT LLM performs as well as or better than professional analysts when extracting CNs. The extracted CNs are well-formulated, sufficiently specific to identify opportunities, and justified by source content (no hallucinations). The SFT LLM is efficient and provides more complete coverage of CNs. The Base LLM was not sufficiently accurate or specific. Organizations can rely on SFT LLMs to reduce manual effort, enhance the precision of CN articulation, and provide improved insight for innovation and marketing strategy.