🤖 AI Summary
Traditional user studies for large language models (LLMs) suffer from low timeliness, coarse granularity, and limited scale, hindering the acquisition of authentic, real-time subjective feedback. Method: We propose the "Interaction-as-Interview" paradigm, implemented via the CLUE system, which automatically generates and conducts structured experience interviews immediately after users complete LLM conversations, enabling scalable, timely, fine-grained feedback collection. CLUE integrates LLM-driven dynamic interview generation, multi-turn dialogue understanding, opinion extraction, and structured log analysis. Contribution/Results: Our study uncovers critical user insights, including polarized perceptions of reasoning-process transparency, acute anxiety about information recency, and strong demand for multimodal interaction. We further release the first high-quality, large-scale chat-interview pair dataset, comprising over 10,000 aligned conversation-interview pairs, to support research in LLM evaluation and human-AI interaction.
📝 Abstract
Which large language model (LLM) is better? Every evaluation tells a story, but what do users really think about current LLMs? This paper presents CLUE, an LLM-powered interviewer that conducts in-the-moment user experience interviews right after users interact with LLMs, and automatically gathers insights about user opinions from massive interview logs. We conduct a study with thousands of users to understand opinions on mainstream LLMs, recruiting users to first chat with a target LLM and then be interviewed by CLUE. Our experiments demonstrate that CLUE captures interesting user opinions, for example, the polarized views on the displayed reasoning process of DeepSeek-R1 and the demand for information freshness and multi-modality. Our collected chat-and-interview logs will be released.