🤖 AI Summary
This work addresses the limited semantic representation of posts in LinkedIn’s professional social networking context. We propose a domain-specific embedding generation method based on multi-task contrastive learning. Our approach fine-tunes a pre-trained Transformer model by jointly optimizing multiple semantic annotation tasks and incorporates a cross-lingual representation alignment mechanism, improving generalization and transferability in retrieval and ranking tasks. To our knowledge, this is the first systematic application of multi-task contrastive learning for positive transfer in the professional social domain, supporting both zero-shot adaptation and multilingual deployment. Experiments show that our embeddings outperform single-task baselines across all LinkedIn semantic tasks, improve zero-shot performance, are more robust in multilingual settings, and surpass general-purpose embedding models, including those from OpenAI, on the LinkedIn dataset.
📝 Abstract
A significant challenge in enhancing LinkedIn's core content recommendation models lies in improving their semantic understanding capabilities. This paper addresses the problem by leveraging multi-task learning, a method that has shown promise in various domains. We fine-tune a pre-trained, transformer-based LLM using multi-task contrastive learning with data from a diverse set of semantic labeling tasks. We observe positive transfer, leading to superior performance across all tasks compared to training independently on each. Our model outperforms the baseline on zero-shot learning and offers improved multilingual support, highlighting its potential for broader application. The specialized content embeddings produced by our model outperform the generalized embeddings offered by OpenAI on LinkedIn datasets and tasks. This work provides a robust foundation for vertical teams across LinkedIn to customize and fine-tune the LLM for their specific applications, and it offers insights and best practices for the field to build on.
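The abstract does not specify the training objective, so the following is only a minimal sketch of what a multi-task contrastive loss over shared embeddings might look like. It assumes an in-batch InfoNCE loss per task and uniform task weights; the function names, the task names (`topic`, `intent`), and the toy vectors are all hypothetical. In practice the vectors would come from the fine-tuned transformer encoder rather than being hard-coded.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchors, positives, temperature=0.1):
    """In-batch contrastive (InfoNCE) loss, assumed here for illustration.

    anchors[i] and positives[i] form a positive pair; every other
    positives[j] in the batch acts as a negative for anchors[i].
    """
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [cosine(anchor, p) / temperature for p in positives]
        # Numerically stable log-sum-exp over the batch.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
        total += log_denominator - logits[i]
    return total / len(anchors)


def multi_task_loss(task_batches, task_weights=None, temperature=0.1):
    """Weighted sum of per-task contrastive losses on shared embeddings.

    task_batches maps a task name to an (anchors, positives) pair.
    Uniform weighting is an assumption of this sketch.
    """
    if task_weights is None:
        task_weights = {name: 1.0 / len(task_batches) for name in task_batches}
    return sum(
        task_weights[name] * info_nce(anchors, positives, temperature)
        for name, (anchors, positives) in task_batches.items()
    )


# Hypothetical toy batches for two labeling tasks.
batches = {
    "topic": ([[1.0, 0.0], [0.0, 1.0]], [[0.9, 0.1], [0.1, 0.9]]),
    "intent": ([[1.0, 1.0], [1.0, -1.0]], [[0.8, 1.0], [1.0, -0.7]]),
}
loss = multi_task_loss(batches)
```

Because every task's loss is computed on embeddings from the same encoder, gradients from all tasks update shared parameters, which is the setting in which the positive transfer described above could arise.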