GroupGPT: A Token-efficient and Privacy-preserving Agentic Framework for Multi-User Chat Assistant

📅 2026-03-01
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing chat assistants struggle to deliver efficient, accurate, and privacy-preserving proactive interventions in multi-user group chats. This work proposes GroupGPT, a framework in which a small model and a large model collaborate so that deciding when to intervene is decoupled from generating the response. GroupGPT supports multimodal inputs and sanitizes user messages for privacy before they are sent to the cloud. To support research in this area, the authors introduce MUIR, a benchmark dataset of 2,500 annotated group chat segments for intervention reasoning. In the reported experiments, GroupGPT achieves an average LLM-assigned score of 4.72 out of 5.0, is well received by users, and reduces token usage by up to 3 times compared to baseline methods.

📝 Abstract
Recent advances in large language models (LLMs) have enabled increasingly capable chatbots. However, most existing systems focus on single-user settings and do not generalize well to multi-user group chats, where agents must intervene more proactively and accurately under complex, evolving contexts. Existing approaches typically rely on LLMs for both reasoning and generation, leading to high token consumption, limited scalability, and potential privacy risks. To address these challenges, we propose GroupGPT, a token-efficient and privacy-preserving agentic framework for multi-user chat assistants. GroupGPT adopts a small-large model collaborative architecture to decouple intervention timing from response generation, enabling efficient and accurate decision-making. The framework also supports multimodal inputs, including memes, images, videos, and voice messages. We further introduce MUIR, a benchmark dataset for multi-user chat assistant intervention reasoning. MUIR contains 2,500 annotated group chat segments with intervention labels and rationales, supporting evaluation of timing accuracy and response quality. We evaluate a range of models on MUIR, from large language models to smaller counterparts. Extensive experiments demonstrate that GroupGPT produces accurate and well-timed responses, achieving an average score of 4.72/5.0 in LLM-based evaluation, and is well received by users across diverse group chat scenarios. Moreover, GroupGPT reduces token usage by up to 3 times compared to baseline methods, while providing privacy sanitization of user messages before cloud transmission. Code is available at: https://github.com/Eliot-Shen/GroupGPT
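
The decoupling described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Message` type, the keyword heuristic standing in for the small model, the two regular expressions used for sanitization, and the `large_model` callable are all assumptions made for illustration. The point is only the control flow: a cheap gate decides whether to intervene, and only then is a sanitized prompt handed to the expensive model.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Message:
    sender: str
    text: str


# Hypothetical patterns; a real sanitizer would cover far more cases.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
PHONE_RE = re.compile(r"\+?\d[\d\s-]{7,}\d")


def sanitize(text: str) -> str:
    """Mask e-mail addresses and phone-like digit runs before upload."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    return PHONE_RE.sub("[PHONE]", text)


def should_intervene(window: List[Message]) -> bool:
    """Stand-in for the small model: a keyword heuristic (assumption)."""
    last = window[-1].text.lower()
    return "@assistant" in last or last.rstrip().endswith("?")


def respond(window: List[Message],
            large_model: Callable[[str], str]) -> Optional[str]:
    """Call the large model only when the gate fires, on sanitized text."""
    if not window or not should_intervene(window):
        return None  # no tokens spent on the large model
    prompt = "\n".join(f"{m.sender}: {sanitize(m.text)}" for m in window)
    return large_model(prompt)
```

In this sketch the large model is never invoked for messages the gate rejects, which is where a token saving would come from, and message text is masked before the prompt is built. Sender names are passed through unchanged here; a fuller sanitizer would likely pseudonymize them as well.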
Problem

Research questions and friction points this paper is trying to address.

multi-user chat
token efficiency
privacy preservation
intervention reasoning
group chat assistant
Innovation

Methods, ideas, or system contributions that make the work stand out.

token-efficient
privacy-preserving
multi-user chat
small-large model collaboration
intervention reasoning
Authors
Zhuokang Shen
East China Normal University
Yifan Wang
Southeast University
Pattern Recognition, Image Processing, Finger Vein Recognition, Feature Template Protection
Hanyu Chen
University of Nottingham Ningbo
Wenxuan Huang
CUHK & ECNU
Artificial General Intelligence, MLLM, LLM, AIGC, Model Acceleration
Shaohui Lin
East China Normal University; Sanming University