One Sentence, Two Embeddings: Contrastive Learning of Explicit and Implicit Semantic Representations

📅 2025-10-10
📈 Citations: 0
Influential: 0
🤖 AI Summary
Traditional sentence embedding methods rely on a single-vector representation paradigm, limiting their ability to simultaneously capture explicit semantics (e.g., literal meaning) and implicit semantics (e.g., inference, subtext), resulting in coarse-grained semantic modeling. To address this, we propose DualCSE, the first framework that jointly learns explicit and implicit dual-channel embeddings for each sentence: the explicit channel focuses on surface-level lexical matching, while the implicit channel models contextual inference relationships; both channels reside in a shared semantic space yet remain disentangled for flexible downstream use. DualCSE is built upon a contrastive learning framework with jointly optimized dual encoders. Extensive experiments demonstrate that DualCSE significantly outperforms state-of-the-art methods on diverse downstream tasks—including information retrieval and text classification—thereby enhancing fine-grained semantic representation and improving implicit semantic capture.

📝 Abstract
Sentence embedding methods have made remarkable progress, yet they still struggle to capture the implicit semantics within sentences. This can be attributed to an inherent limitation of conventional sentence embedding methods: they assign only a single vector per sentence. To overcome this limitation, we propose DualCSE, a sentence embedding method that assigns two embeddings to each sentence: one representing the explicit semantics and the other representing the implicit semantics. These embeddings coexist in a shared space, enabling the selection of the desired semantics for specific purposes such as information retrieval and text classification. Experimental results demonstrate that DualCSE can effectively encode both explicit and implicit meanings and improve performance on downstream tasks.
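The abstract's core idea, two selectable embeddings per sentence living in one shared space, can be illustrated with a minimal sketch. Everything here (the toy vectors, the `corpus` layout, the `retrieve` helper) is an illustrative assumption, not the paper's implementation:

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

# Hypothetical dual embeddings: each sentence maps to an
# (explicit, implicit) pair in the same vector space.
corpus = {
    "He slammed the door.": {
        "explicit": [0.9, 0.1, 0.0],   # literal action
        "implicit": [0.1, 0.8, 0.3],   # subtext: anger
    },
    "She closed the door quietly.": {
        "explicit": [0.8, 0.2, 0.1],
        "implicit": [0.2, 0.1, 0.9],   # subtext: discretion
    },
}

def retrieve(query_vec, channel):
    """Rank sentences by similarity in the chosen semantic channel."""
    scored = [(cosine(query_vec, embs[channel]), sent)
              for sent, embs in corpus.items()]
    return max(scored)[1]

# A query vector near the "anger" direction matches via the implicit
# channel, even though both sentences describe a similar literal action.
angry_query = [0.1, 0.9, 0.2]
best = retrieve(angry_query, "implicit")
```

The point of the sketch is the selection step: the same corpus answers a surface-matching query through the explicit channel and an inference-oriented query through the implicit channel, with no re-encoding.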
Problem

Research questions and friction points this paper is trying to address.

Capturing implicit semantics in sentence embeddings
Overcoming single-vector limitations in representation
Encoding both explicit and implicit semantic meanings
Innovation

Methods, ideas, or system contributions that make the work stand out.

DualCSE assigns two embeddings per sentence
Separates explicit and implicit semantic representations
Shared space enables selective semantic usage
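The summary describes a contrastive learning framework over the two channels. A standard InfoNCE-style objective of the kind typically used for this (a generic sketch, not necessarily DualCSE's exact loss) looks like:

```python
import math

def info_nce(anchor, positive, negatives, temperature=0.05):
    """Generic InfoNCE loss over plain-list embeddings:
    pull the positive toward the anchor, push negatives away."""
    def cos(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        return dot / (math.sqrt(sum(a * a for a in u)) *
                      math.sqrt(sum(b * b for b in v)))
    # Positive pair first, then the in-batch negatives.
    logits = [cos(anchor, positive)] + [cos(anchor, n) for n in negatives]
    logits = [s / temperature for s in logits]
    # Numerically stable log-softmax of the positive's logit.
    m = max(logits)
    denom = sum(math.exp(s - m) for s in logits)
    return -(logits[0] - m - math.log(denom))

# An anchor close to its positive and far from negatives -> near-zero loss.
loss = info_nce([1.0, 0.0], [0.99, 0.1], [[0.0, 1.0], [-1.0, 0.0]])
```

In a dual-channel setup, one would apply such a loss per channel with channel-appropriate positives (e.g. paraphrases for the explicit channel, entailed sentences for the implicit one), which is how the two embeddings of a sentence can stay disentangled while sharing one space.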
Kohei Oda
Japan Advanced Institute of Science and Technology
Po-Min Chuang
Toshiba Corporation
Kiyoaki Shirai
Japan Advanced Institute of Science and Technology
Natthawut Kertkeidkachorn
Japan Advanced Institute of Science and Technology
Knowledge Graph · Natural Language Processing · Machine Learning