🤖 AI Summary
This study addresses two problems in persona creation: manual creation is costly and hard to scale, while existing LLM-based approaches rely on single-shot generation with no iterative refinement, which limits their reliability. The authors propose PerGent, an industry-grade method in which a generator LLM agent and a critic LLM agent, coordinated by an orchestrator, work through repeated critique-and-refinement rounds. Drawing on external resources such as interviews, surveys, and job postings, PerGent progressively improves persona quality up to a user-defined maximum number of rounds. In a real-world deployment at Kinaxis, the method achieved a 96.9% expert approval rate, exceeding all three baselines. Compared with personas that domain experts had created manually, it also reproduced a larger proportion of expert content than the baselines while contributing substantial new content.
📝 Abstract
Personas are widely used in software engineering to support requirements elicitation, design, and validation, but their manual creation is costly, time-consuming, and hard to scale. Recent LLM-based approaches automate persona generation from textual data; however, they typically rely on single-shot generation and subjective evaluations, limiting practical reliability. We present PerGent, an industry-grade method for persona generation built around an iterative critique-refinement loop. Specifically, PerGent uses a generator LLM agent and a critic LLM agent, coordinated by an orchestrator, to refine personas against external resources such as interviews, surveys, and job postings for up to a user-defined maximum number of rounds. We deploy and evaluate PerGent in an industrial setting at Kinaxis, comparing it with three baselines, including one-shot methods. In an expert in-situ evaluation, PerGent achieved the highest expert approval rate (96.9%), exceeding all baselines. We further compare PerGent-generated personas with best-practice personas manually created by domain experts prior to the adoption of LLMs. Compared to baselines, PerGent reproduces a larger proportion of expert content while also contributing substantial new content beyond the pre-LLM personas. We conclude with lessons learned from deploying and evaluating PerGent at Kinaxis.
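The critique-refinement loop described in the abstract can be pictured with a minimal Python sketch. This is not the authors' implementation: the names `refine_persona`, `Critique`, `toy_generate`, and `toy_critic` are hypothetical, the two agents are plain callables standing in for LLM calls, and stopping early once the critic approves is an assumption (the abstract only states that the number of rounds is capped by a user-defined maximum).

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Critique:
    """Hypothetical critic verdict: approval flag plus feedback text."""
    approved: bool
    feedback: str


# Stand-ins for LLM agents (assumed signatures, not from the paper).
GenerateFn = Callable[[List[str], str, str], str]  # (sources, draft, feedback) -> draft
CritiqueFn = Callable[[List[str], str], Critique]  # (sources, draft) -> verdict


def refine_persona(
    sources: List[str],
    generate: GenerateFn,
    critique: CritiqueFn,
    max_rounds: int = 3,
) -> Tuple[str, int]:
    """Orchestrate generator and critic for at most `max_rounds` rounds."""
    if max_rounds < 1:
        raise ValueError("max_rounds must be at least 1")
    draft = generate(sources, "", "")  # initial draft, no feedback yet
    rounds_used = 0
    for _ in range(max_rounds):
        rounds_used += 1
        review = critique(sources, draft)
        if review.approved:  # assumed early-stop rule
            break
        draft = generate(sources, draft, review.feedback)
    return draft, rounds_used


def toy_generate(sources: List[str], draft: str, feedback: str) -> str:
    """Toy generator: appends whatever the critic asked for."""
    if not draft:
        return "Persona draft."
    return draft + " Addresses: " + feedback


def toy_critic(sources: List[str], draft: str) -> Critique:
    """Toy critic: requests the first source not yet reflected in the draft."""
    missing = [s for s in sources if s not in draft]
    if missing:
        return Critique(False, missing[0])
    return Critique(True, "")


if __name__ == "__main__":
    persona, rounds = refine_persona(
        ["interview notes", "survey results"], toy_generate, toy_critic
    )
    print(rounds, persona)
```

In a real deployment the two callables would wrap LLM prompts, and the critic would check the draft against the source material rather than doing substring matching. Raising `max_rounds` trades extra model calls for more opportunities to revise.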