Agent for User: Testing Multi-User Interactive Features in TikTok

📅 2025-04-21
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Testing multi-user interactive features—such as live streaming and voice calls—in social apps (e.g., TikTok) is challenging due to the need for synchronized multi-device coordination, role-aware behavior modeling, and real-time interaction simulation. To address this, we propose the first LLM-driven multi-agent collaborative testing paradigm tailored for multi-user interaction scenarios. Within a virtual device farm, we deploy dedicated LLM agents for each user role, enabling cross-device, role-specific behavioral modeling and task orchestration. Our approach integrates action-sequence modeling with cross-device coordinated control. Evaluated on 24 multi-user tasks, it achieves 75% task coverage and 85.9% action similarity, reducing manual testing effort by 87%. Deployed on TikTok’s internal testing platform, the method uncovered 26 interaction-related defects.
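The per-role agents and round-robin cross-device coordination described above can be sketched as follows. This is a minimal illustration, not the paper's implementation; all names (`RoleAgent`, `run_task`, the placeholder observation strings) are hypothetical, and the real system would query an LLM to pick each UI action from the role prompt and screen state.

```python
from dataclasses import dataclass, field

@dataclass
class RoleAgent:
    """An agent bound to one virtual device and one user role."""
    role: str
    actions: list = field(default_factory=list)

    def next_action(self, task: str, observation: str) -> str:
        # Placeholder: the real system would ask an LLM to choose
        # the next UI action from role, task, and screen state.
        action = f"{self.role}: act on '{observation}' for '{task}'"
        self.actions.append(action)
        return action

def run_task(task: str, roles: list[str]) -> dict[str, list]:
    """Allocate one agent per role (one per device) and step them
    in turn so cross-device interactions stay ordered."""
    agents = {r: RoleAgent(r) for r in roles}
    for step in range(2):  # round-robin: each agent acts once per step
        for agent in agents.values():
            agent.next_action(task, f"screen@step{step}")
    return {r: a.actions for r, a in agents.items()}

if __name__ == "__main__":
    log = run_task("start live stream", ["host", "viewer"])
    for role, acts in log.items():
        print(role, len(acts))
```

The round-robin loop is one simple way to keep a deterministic interleaving across devices; the paper's task orchestration is likely more sophisticated.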

📝 Abstract
TikTok, a widely-used social media app boasting over a billion monthly active users, requires effective app quality assurance for its intricate features. Feature testing is crucial in achieving this goal. However, the multi-user interactive features within the app, such as live streaming, voice calls, etc., pose significant challenges for developers, who must handle simultaneous device management and user interaction coordination. To address this, we introduce a novel multi-agent approach, powered by Large Language Models (LLMs), to automate the testing of multi-user interactive app features. In detail, we build a virtual device farm that allocates the necessary number of devices for a given multi-user interactive task. For each device, we deploy an LLM-based agent that simulates a user, thereby mimicking user interactions to collaboratively automate the testing process. The evaluations on 24 multi-user interactive tasks within the TikTok app showcase its capability to cover 75% of tasks with 85.9% action similarity and offer 87% time savings for developers. Additionally, we integrated our approach into the real-world TikTok testing platform, aiding in the detection of 26 multi-user interactive bugs.
Problem

Research questions and friction points this paper is trying to address.

Automating testing for TikTok's multi-user interactive features
Managing simultaneous device and user interaction coordination
Detecting bugs in live streaming and voice call features
Innovation

Methods, ideas, or system contributions that make the work stand out.

LLM-based agents simulate user interactions
Virtual device farm manages multiple devices
Automates testing with high action similarity
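The reported 85.9% action similarity compares generated action traces against reference traces. The paper's exact metric is not given on this page; one minimal, hypothetical way to score two traces is a longest-matching-subsequence ratio:

```python
from difflib import SequenceMatcher

def action_similarity(generated: list[str], reference: list[str]) -> float:
    """Ratio of matching action subsequences between two traces
    (a stand-in for the paper's unspecified similarity metric)."""
    return SequenceMatcher(a=generated, b=reference).ratio()

# Hypothetical traces for a live-streaming task.
gen = ["open_app", "tap_live", "invite_guest", "end_stream"]
ref = ["open_app", "tap_live", "start_stream", "invite_guest", "end_stream"]
print(round(action_similarity(gen, ref), 2))  # → 0.89
```

`SequenceMatcher.ratio()` returns `2*M/T`, where `M` is the number of matched elements and `T` the combined trace length, so a missed action in an otherwise aligned trace is penalized proportionally.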
Sidong Feng
PhD Student, Monash University
Human Computer Interaction, Software Engineering, Deep Learning
Changhao Du
Jilin University, China
Huaxiao Liu
Jilin University, China
Qingnan Wang
Jilin University, China
Zhengwei Lv
Bytedance, China
Gang Huo
Bytedance, China
Xu Yang
Bytedance, China
Chunyang Chen
Professor at Department of Computer Science, Technical University of Munich
Software Engineering, Deep Learning, Human Computer Interaction, LLM4SE, GUI