Using AI for User Representation: An Analysis of 83 Persona Prompts

📅 2025-08-18
📈 Citations: 0
Influential: 0
🤖 AI Summary
Current LLM-based user persona generation suffers from superficial descriptions, format constraints (overreliance on textual/numerical outputs), inconsistent prompt design, and a lack of cross-model evaluation. Method: This study systematically analyzes 83 persona-generation prompts drawn from 27 prior works—the first cross-study prompt engineering synthesis—employing qualitative coding and quantitative statistics to examine output formats, attribute types, and prompting patterns. Contribution/Results: We find that over 50% of prompts enforce structured outputs (e.g., JSON), 74% utilize dynamic variable injection, yet >90% of studies employ only a few prompts and omit comparative LLM evaluations. The analysis reveals critical limitations in persona richness, multidimensionality, and methodological rigor. To address these, we propose three optimization pathways: (1) structured guidance for consistent output semantics, (2) dynamic contextual modeling for adaptive persona instantiation, and (3) multi-model collaborative evaluation for robust assessment. This work offers both methodological reflection and practical benchmarks for computational user representation.

📝 Abstract
We analyzed 83 persona prompts from 27 research articles that used large language models (LLMs) to generate user personas. Findings show that the prompts predominantly generate single personas. Several prompts express a desire for short or concise persona descriptions, which deviates from the tradition of creating rich, informative, and rounded persona profiles. Text is the most common format for generated persona attributes, followed by numbers. Text and numbers are often generated together, and demographic attributes are included in nearly all generated personas. Researchers use up to 12 prompts in a single study, though most research uses a small number of prompts. Comparing and testing multiple LLMs is rare. More than half of the prompts require the persona output in a structured format, such as JSON, and 74% of the prompts insert data or dynamic variables. We discuss the implications of increased use of computational personas for user representation.
Problem

Research questions and friction points this paper is trying to address.

Analyzing the predominance of single AI-generated user personas
Examining the deviation from the tradition of rich, rounded persona profiles
Investigating the use of structured output formats in persona prompts
Innovation

Methods, ideas, or system contributions that make the work stand out.

Analyzed 83 LLM persona-generation prompts drawn from 27 studies
Found that over half of the prompts require structured output, such as JSON
Found that 74% of the prompts insert data or dynamic variables
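The two dominant prompting patterns the paper reports, dynamic variable injection and a required structured (JSON) output, can be sketched as follows. This is an illustrative example, not a prompt from the paper; the template text, attribute keys, and helper names are assumptions.

```python
import json
from string import Template

# Hypothetical persona-generation prompt combining the two patterns the
# analysis reports: runtime data injection ($product, $user_data) and an
# explicit structured-output (JSON) constraint.
PERSONA_PROMPT = Template(
    "You are generating a user persona for $product.\n"
    "Base the persona on this user data: $user_data\n"
    "Return ONLY valid JSON with the keys: "
    '"name", "age", "occupation", "goals", "frustrations".'
)

def build_prompt(product: str, user_data: dict) -> str:
    """Inject runtime data into the persona-generation prompt."""
    return PERSONA_PROMPT.substitute(
        product=product, user_data=json.dumps(user_data)
    )

def parse_persona(llm_response: str) -> dict:
    """Check that a model response honors the structured-output constraint."""
    persona = json.loads(llm_response)
    required = {"name", "age", "occupation", "goals", "frustrations"}
    missing = required - persona.keys()
    if missing:
        raise ValueError(f"Persona missing keys: {sorted(missing)}")
    return persona

prompt = build_prompt("a news app", {"sessions_per_week": 9, "top_topic": "sports"})
print(prompt)
```

Requiring JSON keys up front is what makes the outputs machine-checkable, which is presumably why more than half of the surveyed prompts enforce it.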
Joni Salminen
Associate Professor (tenure track) at the University of Vaasa
Personas · Technology · Marketing
Danial Amin
University of Vaasa, Vaasa, Finland
Bernard J. Jansen
Qatar Computing Research Institute, Hamad Bin Khalifa University, Doha, Qatar