Conf-Profile: A Confidence-Driven Reasoning Paradigm for Label-Free User Profiling

📅 2025-09-23

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

User profiling faces several challenges: ground-truth labels are scarce, user data is highly heterogeneous and noisy, large language models (LLMs) are of limited reliability on such data, and comprehensive benchmarks are lacking. To address these issues, the authors propose Conf-Profile, a confidence-driven, two-stage, label-free user profiling framework. In the first stage, advanced LLMs generate synthetic labels together with confidence scores; the labels are refined through confidence-weighted voting and confidence calibration, then distilled into a lightweight LLM. In the second stage, confidence-guided unsupervised reinforcement learning uses confidence for difficulty filtering, quasi-ground-truth voting, and reward weighting. With both stages applied, Conf-Profile improves F1 by 13.97 on Qwen3-8B. The work offers a label-free paradigm for reliable user profiling.

📝 Abstract

User profiling, as a core technique for user understanding, aims to infer structural attributes from user information. Large Language Models (LLMs) provide a promising avenue for user profiling, yet the progress is hindered by the lack of comprehensive benchmarks. To bridge this gap, we propose ProfileBench, an industrial benchmark derived from a real-world video platform, encompassing heterogeneous user data and a well-structured profiling taxonomy. However, the profiling task remains challenging due to the difficulty of collecting large-scale ground-truth labels, and the heterogeneous and noisy user information can compromise the reliability of LLMs. To approach label-free and reliable user profiling, we propose a Confidence-driven Profile reasoning framework Conf-Profile, featuring a two-stage paradigm. We first synthesize high-quality labels by leveraging advanced LLMs with confidence hints, followed by confidence-weighted voting for accuracy improvement and confidence calibration for a balanced distribution. The multiple profile results, rationales, and confidence scores are aggregated and distilled into a lightweight LLM. We further enhance the reasoning ability via confidence-guided unsupervised reinforcement learning, which exploits confidence for difficulty filtering, quasi-ground truth voting, and reward weighting. Experimental results demonstrate that Conf-Profile delivers substantial performance through the two-stage training, improving F1 by 13.97 on Qwen3-8B.
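The confidence-weighted voting step mentioned in the abstract can be illustrated with a minimal Python sketch. It assumes each LLM sample yields a `(label, confidence)` pair with confidence in [0, 1]; the function name `confidence_weighted_vote`, the example labels, and the summing rule are illustrative assumptions, not details taken from the paper.

```python
from collections import defaultdict


def confidence_weighted_vote(samples):
    """Pick the label with the largest summed confidence.

    samples: list of (label, confidence) pairs (hypothetical format).
    Returns (winning_label, winner's share of total confidence).
    """
    if not samples:
        raise ValueError("need at least one sample")
    totals = defaultdict(float)
    for label, conf in samples:
        totals[label] += conf  # each vote counts in proportion to its confidence
    winner = max(totals, key=totals.get)
    total = sum(totals.values())
    share = totals[winner] / total if total > 0 else 0.0
    return winner, share


# Two votes each: a plain majority vote would tie, confidence breaks the tie.
samples = [("gaming", 0.9), ("sports", 0.6), ("sports", 0.5), ("gaming", 0.4)]
label, share = confidence_weighted_vote(samples)
```

In this example the two labels receive the same number of votes, so the summed confidence decides the outcome. The returned share could serve as an aggregate confidence for later calibration, though how the paper calibrates is not described here.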

Problem

Research questions and friction points this paper is trying to address.

Addresses the lack of comprehensive benchmarks for user profiling with LLMs.

Tackles label scarcity and heterogeneous, noisy user information in profiling.

Develops a confidence-driven framework for label-free, reliable user attribute inference.

Innovation

Methods, ideas, or system contributions that make the work stand out.

Synthesizes labels using LLMs with confidence hints.

Uses confidence-weighted voting and calibration techniques.

Applies confidence-guided unsupervised reinforcement learning.
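One plausible reading of the confidence-guided reinforcement learning stage is sketched below in Python. The abstract names three uses of confidence (difficulty filtering, quasi-ground-truth voting, reward weighting) but gives no formulas, so the function `confidence_guided_rewards`, the thresholds `low` and `high`, and the reward rule are all assumptions made for illustration.

```python
from collections import defaultdict


def confidence_guided_rewards(rollouts, low=0.4, high=0.95):
    """Compute per-rollout rewards for one user without ground-truth labels.

    rollouts: list of (label, confidence) pairs sampled from the policy.
    Returns a list of rewards, or None if the user is filtered out.
    `low` and `high` are illustrative thresholds, not values from the paper.
    """
    totals = defaultdict(float)
    for label, conf in rollouts:
        totals[label] += conf
    total = sum(totals.values())
    if total <= 0:
        return None
    # Quasi-ground truth: the label with the most accumulated confidence.
    quasi = max(totals, key=totals.get)
    agreement = totals[quasi] / total
    # Difficulty filtering: skip users that look too ambiguous or too easy.
    if agreement < low or agreement > high:
        return None
    # Reward weighting: scale the match reward by the agreement level.
    return [agreement if label == quasi else 0.0 for label, _ in rollouts]


rewards = confidence_guided_rewards([("18-24", 0.8), ("18-24", 0.7), ("25-34", 0.5)])
too_easy = confidence_guided_rewards([("18-24", 0.9), ("18-24", 0.9)])
```

Here a rollout is rewarded only when it matches the voted label, and the reward shrinks as agreement among rollouts drops, so less trustworthy quasi-labels contribute weaker training signal. Users whose rollouts agree almost unanimously are skipped because they offer little to learn from.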

🔎 Similar Papers

Unmasking Social Bots: How Confident Are We?