LLM4ES: Learning User Embeddings from Event Sequences via Large Language Models

📅 2025-08-06
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper addresses the challenges of low-quality user embeddings and poor domain adaptability in event sequence modeling. The authors propose a large language model (LLM)-based framework for learning textual user representations. The method first applies event sequence textual augmentation to transform heterogeneous events into semantically coherent natural-language sequences, then performs lightweight fine-tuning of the LLM via next-token prediction to capture temporal dependencies among events and underlying user behavioral patterns. Evaluated on cross-domain user classification tasks, including financial risk assessment and clinical prognosis prediction, the approach significantly outperforms conventional sequence embedding methods (e.g., RNNs, Transformers, Graph2Vec) and shows superior robustness and generalizability under low-diversity data conditions. The framework offers a scalable, interpretable paradigm for high-value user modeling, bridging the gap between symbolic event semantics and deep sequence modeling.
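The "event sequence textual augmentation" step can be illustrated with a minimal sketch. The event schema below (`type`, `timestamp`, `amount` fields) and the sentence templates are illustrative assumptions, not the paper's actual templates; the idea is simply that heterogeneous structured events become one chronologically ordered natural-language document, which an LLM can then be fine-tuned on with next-token prediction.

```python
from datetime import datetime

def event_to_sentence(event):
    """Render one heterogeneous event as a natural-language clause.
    Field names and templates here are hypothetical, for illustration only."""
    ts = datetime.fromisoformat(event["timestamp"]).strftime("%d %B %Y")
    if event["type"] == "purchase":
        return f"On {ts}, the user made a purchase of ${event['amount']:.2f}."
    if event["type"] == "login":
        return f"On {ts}, the user logged in."
    return f"On {ts}, an event of type '{event['type']}' occurred."

def sequence_to_text(events):
    """Concatenate chronologically ordered events into one training document."""
    ordered = sorted(events, key=lambda e: e["timestamp"])
    return " ".join(event_to_sentence(e) for e in ordered)

events = [
    {"type": "login", "timestamp": "2024-03-02T09:15:00"},
    {"type": "purchase", "timestamp": "2024-03-01T18:30:00", "amount": 49.9},
]
text = sequence_to_text(events)
print(text)
# On 01 March 2024, the user made a purchase of $49.90. On 02 March 2024, the user logged in.
```

The resulting documents can be fed to any causal LLM's standard fine-tuning loop; the paper's enrichment technique would add further descriptive text on top of this plain rendering.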

📝 Abstract
This paper presents LLM4ES, a novel framework that exploits large pre-trained language models (LLMs) to derive user embeddings from event sequences. Event sequences are transformed into a textual representation, which is subsequently used to fine-tune an LLM through next-token prediction to generate high-quality embeddings. We introduce a text enrichment technique that enhances LLM adaptation to event sequence data, improving representation quality for low-variability domains. Experimental results demonstrate that LLM4ES achieves state-of-the-art performance in user classification tasks in financial and other domains, outperforming existing embedding methods. The resulting user embeddings can be incorporated into a wide range of applications, from user segmentation in finance to patient outcome prediction in healthcare.
Problem

Research questions and friction points this paper is trying to address.

Derive user embeddings from event sequences using LLMs
Enhance LLM adaptation for low-variability event data
Improve user classification in finance and healthcare domains
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses LLMs for user embeddings from events
Transforms events to text for LLM fine-tuning
Enriches text to improve low-variability adaptation
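Once the LLM has been fine-tuned on the textualized sequences, a fixed-size user embedding must be read out from its token-level hidden states. A common readout is masked mean pooling; whether the paper uses mean pooling or another readout is an assumption here, so treat this as a generic sketch:

```python
import numpy as np

def mean_pool(hidden_states, attention_mask):
    """Average token-level hidden states into one fixed-size user embedding,
    ignoring padding positions. Mean pooling is an assumed readout, not
    necessarily the paper's exact choice."""
    mask = attention_mask[:, :, None].astype(hidden_states.dtype)  # (batch, seq, 1)
    summed = (hidden_states * mask).sum(axis=1)                    # (batch, hidden)
    counts = mask.sum(axis=1).clip(min=1e-9)                       # avoid divide-by-zero
    return summed / counts

# Toy example: 1 sequence, 4 tokens (last is padding), hidden size 3.
h = np.array([[[1., 2., 3.],
               [3., 2., 1.],
               [2., 2., 2.],
               [9., 9., 9.]]])   # padded token must be ignored
m = np.array([[1, 1, 1, 0]])
emb = mean_pool(h, m)
print(emb)  # [[2. 2. 2.]]
```

The pooled vectors can then be passed to any downstream classifier (e.g., logistic regression for the financial-risk and clinical-prognosis tasks described above).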