Conf-Profile: A Confidence-Driven Reasoning Paradigm for Label-Free User Profiling

📅 2025-09-23
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
User profiling is hindered by scarce ground-truth labels, heterogeneous and noisy user data, the limited reliability of large language models (LLMs), and the absence of standardized benchmarks. To address these issues, we propose Conf-Profile, a confidence-driven, two-stage, label-free user profiling framework. In the first stage, advanced LLMs generate synthetic labels together with confidence scores; the labels are refined through confidence-weighted voting and confidence calibration, and the results are distilled into a lightweight LLM. In the second stage, confidence-guided unsupervised reinforcement learning uses confidence for difficulty filtering, quasi-ground-truth voting, and reward weighting. On Qwen3-8B, the two-stage training improves F1 by 13.97, demonstrating significantly enhanced robustness and generalization. This work offers a scalable, label-free paradigm for high-quality user profiling.

๐Ÿ“ Abstract
User profiling, as a core technique for user understanding, aims to infer structural attributes from user information. Large Language Models (LLMs) provide a promising avenue for user profiling, yet progress is hindered by the lack of comprehensive benchmarks. To bridge this gap, we propose ProfileBench, an industrial benchmark derived from a real-world video platform, encompassing heterogeneous user data and a well-structured profiling taxonomy. However, the profiling task remains challenging due to the difficulty of collecting large-scale ground-truth labels, and the heterogeneous and noisy user information can compromise the reliability of LLMs. To approach label-free and reliable user profiling, we propose a Confidence-driven Profile reasoning framework, Conf-Profile, featuring a two-stage paradigm. We first synthesize high-quality labels by leveraging advanced LLMs with confidence hints, followed by confidence-weighted voting for accuracy improvement and confidence calibration for a balanced distribution. The multiple profile results, rationales, and confidence scores are aggregated and distilled into a lightweight LLM. We further enhance the reasoning ability via confidence-guided unsupervised reinforcement learning, which exploits confidence for difficulty filtering, quasi-ground truth voting, and reward weighting. Experimental results demonstrate that Conf-Profile delivers substantial performance gains through the two-stage training, improving F1 by 13.97 on Qwen3-8B.
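
The confidence-weighted voting step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact aggregation rule, so the code below assumes the simplest variant: each sampled answer contributes its self-reported confidence to its label, and the label with the largest total wins. The function name `confidence_weighted_vote` and the example labels are hypothetical.

```python
from collections import defaultdict


def confidence_weighted_vote(predictions):
    """Aggregate (label, confidence) pairs into one label.

    Assumption: each prediction adds its confidence (in [0, 1]) to the
    score of its label. This is an illustrative rule, not necessarily
    the one used in the paper.
    Returns the winning label and its share of the total confidence mass.
    """
    if not predictions:
        raise ValueError("need at least one prediction")

    scores = defaultdict(float)
    for label, confidence in predictions:
        scores[label] += confidence

    total = sum(scores.values())
    winner = max(scores, key=scores.get)
    share = scores[winner] / total if total > 0 else 0.0
    return winner, share


# Four sampled answers for one user attribute (made-up values).
samples = [("sports", 0.9), ("gaming", 0.6), ("gaming", 0.5), ("sports", 0.4)]
label, share = confidence_weighted_vote(samples)
print(label, round(share, 3))
```

In this example a plain majority vote would tie at two votes each, while the confidence-weighted totals (1.3 for "sports" versus 1.1 for "gaming") break the tie in favor of the more confident answers.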

Problem

Research questions and friction points this paper is trying to address.

Addresses the lack of comprehensive benchmarks for user profiling with LLMs.
Solves the challenge of label scarcity and noisy user information for profiling.
Develops a confidence-driven framework for label-free, reliable user attribute inference.

Innovation

Methods, ideas, or system contributions that make the work stand out.

Synthesizes labels using LLMs with confidence hints.
Uses confidence-weighted voting and calibration techniques.
Applies confidence-guided unsupervised reinforcement learning.
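
The three uses of confidence in the reinforcement learning stage (difficulty filtering, quasi-ground-truth voting, and reward weighting) can be sketched as follows. This is a sketch under stated assumptions, not the paper's formulation: the thresholds `low` and `high`, the use of the vote's confidence share as an agreement score, and the reward rule are all hypothetical choices made for illustration.

```python
from collections import defaultdict


def quasi_ground_truth(rollouts):
    """Confidence-weighted vote over (label, confidence) rollouts.

    Returns the winning label and its share of the confidence mass,
    used here as an agreement score (an assumption of this sketch).
    """
    scores = defaultdict(float)
    for label, confidence in rollouts:
        scores[label] += confidence
    total = sum(scores.values())
    if total <= 0:
        raise ValueError("rollouts must carry positive total confidence")
    target = max(scores, key=scores.get)
    return target, scores[target] / total


def is_informative(agreement, low=0.5, high=0.95):
    """Difficulty filter with hypothetical thresholds.

    Drops samples whose vote is too uncertain (unreliable target) or
    nearly unanimous (little left to learn).
    """
    return low <= agreement <= high


def weighted_rewards(rollouts, target, agreement):
    """Reward rollouts that match the voted target, scaled by agreement."""
    return [agreement if label == target else 0.0 for label, _ in rollouts]


# Three policy rollouts for one user (made-up labels and confidences).
rollouts = [("18-24", 0.8), ("18-24", 0.7), ("25-34", 0.5)]
target, agreement = quasi_ground_truth(rollouts)
keep = is_informative(agreement)
rewards = weighted_rewards(rollouts, target, agreement)
print(target, agreement, keep, rewards)
```

Scaling the reward by the agreement score means that targets backed by a weak vote contribute a smaller learning signal, which is one plausible way to limit the effect of noisy quasi-labels.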
🔎 Similar Papers
Yingxin Li
Tsinghua University
LLM, VLM, Efficient ML
Jianbo Zhao
Douyin Content Group, Bytedance
Xueyu Ren
Douyin Content Group, Bytedance
Jie Tang
UW Madison
Computed Tomography
Wangjie You
Douyin Content Group, Bytedance
Xu Chen
Douyin Content Group, Bytedance
Kan Zhou
Douyin Content Group, Bytedance
Chao Feng
University of Zurich
network, machine learning, cybersecurity
Jiao Ran
Douyin Content Group, Bytedance
Yuan Meng
Department of Computer Science and Technology, Tsinghua University
Zhi Wang
Shenzhen International Graduate School, Tsinghua University