🤖 AI Summary
To address the prohibitively high computational cost and deployment impracticality of applying large language models (LLMs) directly to long historical event sequences of bank customers, this paper proposes a lightweight contrastive learning framework. First, raw transaction and communication events are compressed into semantically distilled prompts; a frozen LLM then generates high-quality semantic supervision signals without fine-tuning. Second, contrastive learning aligns event-sequence embeddings, produced by an efficient encoder, with the LLM's semantic embeddings, enabling effective customer representation learning. Crucially, the method avoids end-to-end LLM inference, substantially reducing latency and resource consumption. Evaluated on real-world financial datasets, the approach outperforms existing state-of-the-art methods on customer behavior prediction tasks while meeting stringent real-time system requirements, striking a favorable balance of accuracy, efficiency, and practical deployability.
📝 Abstract
Learning client embeddings from sequences of their historical communications is central to financial applications. While large language models (LLMs) offer general world knowledge, applying them directly to long event sequences is computationally expensive and impractical in real-world pipelines. In this paper, we propose LATTE, a contrastive learning framework that aligns raw event embeddings with semantic embeddings from frozen LLMs. Behavioral features are summarized into short prompts, embedded by the LLM, and used as supervision via a contrastive loss. The proposed approach significantly reduces inference cost and input size compared to conventional processing of the complete sequence by an LLM. We experimentally show that our method outperforms state-of-the-art techniques for learning event-sequence representations on real-world financial datasets while remaining deployable in latency-sensitive environments.
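The abstract does not specify the exact form of the contrastive loss; an alignment of this kind is commonly implemented with an InfoNCE-style objective, where each event-sequence embedding is pulled toward its matching LLM prompt embedding and pushed away from the other prompts in the batch. A minimal NumPy sketch under that assumption (all function and variable names hypothetical, not from the paper):

```python
import numpy as np

def info_nce_loss(event_emb: np.ndarray, llm_emb: np.ndarray,
                  temperature: float = 0.07) -> float:
    """Contrastive alignment of encoder outputs with LLM embeddings.

    event_emb: (B, d) embeddings from the lightweight event-sequence encoder.
    llm_emb:   (B, d) embeddings of the distilled prompts from the frozen LLM.
    Row i of each matrix describes the same customer (a positive pair);
    all other rows in the batch serve as negatives.
    """
    # L2-normalize so the dot product becomes cosine similarity.
    e = event_emb / np.linalg.norm(event_emb, axis=1, keepdims=True)
    s = llm_emb / np.linalg.norm(llm_emb, axis=1, keepdims=True)
    logits = e @ s.T / temperature              # (B, B) similarity matrix
    # Log-softmax over each row; matched pairs sit on the diagonal.
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))
```

The loss is minimized when each customer's encoder embedding is closest to its own prompt embedding, which is what lets the cheap encoder replace full-sequence LLM inference at serving time.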