Improved Content Understanding With Effective Use of Multi-task Contrastive Learning

📅 2024-05-18

🏛️ arXiv.org

📈 Citations: 2

✨ Influential: 0

🤖 AI Summary

This work addresses the weak semantic representation of posts in LinkedIn’s professional social networking context. We propose a domain-specific embedding generation method based on multi-task contrastive learning. Our approach fine-tunes a pre-trained Transformer model by jointly optimizing multiple semantic annotation tasks and incorporates a cross-lingual representation alignment mechanism, improving generalization and transferability in retrieval and ranking tasks. To our knowledge, this is the first systematic application of multi-task contrastive learning for positive transfer in the professional social domain, supporting both zero-shot adaptation and multilingual deployment. Experiments show that our embeddings outperform single-task baselines across all LinkedIn semantic tasks, improve zero-shot performance, are more robust in multilingual settings, and surpass general-purpose embedding models, including those from OpenAI, on the LinkedIn dataset.

📝 Abstract

In enhancing LinkedIn's core content recommendation models, a significant challenge lies in improving their semantic understanding capabilities. This paper addresses the problem by leveraging multi-task learning, a method that has shown promise in various domains. We fine-tune a pre-trained, transformer-based LLM using multi-task contrastive learning with data from a diverse set of semantic labeling tasks. We observe positive transfer, leading to superior performance across all tasks when compared to training independently on each. Our model outperforms the baseline on zero-shot learning and offers improved multilingual support, highlighting its potential for broader application. The specialized content embeddings produced by our model outperform generalized embeddings offered by OpenAI on LinkedIn datasets and tasks. This work provides a robust foundation for vertical teams across LinkedIn to customize and fine-tune the LLM to their specific applications. Our work offers insights and best practices for the field to build on.

Problem

Research questions and friction points this paper is trying to address.

Fine-tunes LLMs for semantic post embeddings using multi-task learning

Improves retrieval and ranking performance in LinkedIn feed systems

Outperforms baseline models and OpenAI embeddings on LinkedIn tasks

Innovation

Methods, ideas, or system contributions that make the work stand out.

Fine-tuned transformer LLM with multi-task learning

Outperforms OpenAI embeddings on LinkedIn tasks

Deployed for near-line use within minutes
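
Illustrative sketch

The abstract does not spell out the training objective, so the following is only a minimal sketch of what a multi-task contrastive loss can look like, not the paper's implementation. It assumes an in-batch InfoNCE loss per task and a weighted sum across tasks. The function names, the temperature value, the task names ("topic", "intent"), and the toy two-dimensional embeddings are all hypothetical; in a real setup the vectors would come from one shared encoder that is being fine-tuned.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchors, positives, temperature=0.1):
    """In-batch contrastive (InfoNCE) loss, assumed here for illustration.

    positives[i] is the match for anchors[i]; every other positives[j]
    in the batch acts as a negative for anchors[i].
    """
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [cosine(anchor, p) / temperature for p in positives]
        # Log-sum-exp with the max subtracted for numerical stability.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(x - m) for x in logits))
        total += log_denom - logits[i]
    return total / len(anchors)


def multi_task_loss(batches, weights=None):
    """Weighted sum of per-task contrastive losses.

    batches maps a task name to an (anchors, positives) pair of
    embedding lists. Equal weights are an assumption, not a claim
    about the paper.
    """
    weights = weights or {task: 1.0 for task in batches}
    return sum(weights[t] * info_nce(a, p) for t, (a, p) in batches.items())


# Toy usage with hypothetical tasks and hand-written embeddings.
batches = {
    "topic": ([[1.0, 0.0], [0.0, 1.0]], [[0.9, 0.1], [0.1, 0.9]]),
    "intent": ([[1.0, 1.0], [1.0, -1.0]], [[0.8, 1.0], [1.0, -0.7]]),
}
loss = multi_task_loss(batches)
```

Because every task's loss is computed on embeddings from the same encoder, gradients from all tasks update the same parameters. That shared update is the route by which the positive transfer described in the abstract could arise.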
