One Sentence, Two Embeddings: Contrastive Learning of Explicit and Implicit Semantic Representations

📅 2025-10-10
📈 Citations: 0
Influential: 0
🤖 AI Summary
Traditional sentence embedding methods rely on a single-vector representation paradigm, limiting their ability to simultaneously capture explicit semantics (e.g., literal meaning) and implicit semantics (e.g., inference, subtext), resulting in coarse-grained semantic modeling. To address this, we propose DualCSE, the first framework that jointly learns explicit and implicit dual-channel embeddings for each sentence: the explicit channel focuses on surface-level lexical matching, while the implicit channel models contextual inference relationships; both channels reside in a shared semantic space yet remain disentangled for flexible downstream use. DualCSE is built upon a contrastive learning framework with jointly optimized dual encoders. Extensive experiments demonstrate that DualCSE significantly outperforms state-of-the-art methods on diverse downstream tasks—including information retrieval and text classification—thereby enhancing fine-grained semantic representation and improving implicit semantic capture.

📝 Abstract
Sentence embedding methods have made remarkable progress, yet they still struggle to capture the implicit semantics within sentences. This can be attributed to an inherent limitation of conventional sentence embedding methods: they assign only a single vector per sentence. To overcome this limitation, we propose DualCSE, a sentence embedding method that assigns two embeddings to each sentence: one representing the explicit semantics and the other representing the implicit semantics. These embeddings coexist in a shared space, enabling the selection of the desired semantics for specific purposes such as information retrieval and text classification. Experimental results demonstrate that DualCSE can effectively encode both explicit and implicit meanings and improve performance on downstream tasks.
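The abstract's core idea, two selectable embeddings per sentence living in one shared space, can be illustrated with a minimal sketch. Everything here (the toy vectors, the `corpus` layout, the `retrieve` helper) is an illustrative assumption, not the paper's implementation:

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

# Hypothetical dual embeddings: each sentence maps to an
# (explicit, implicit) pair in the same vector space.
corpus = {
    "He slammed the door.": {
        "explicit": [0.9, 0.1, 0.0],   # literal action
        "implicit": [0.1, 0.8, 0.3],   # subtext: anger
    },
    "She closed the door quietly.": {
        "explicit": [0.8, 0.2, 0.1],
        "implicit": [0.2, 0.1, 0.9],   # subtext: discretion
    },
}

def retrieve(query_vec, channel):
    """Rank sentences by similarity in the chosen semantic channel."""
    scored = [(cosine(query_vec, embs[channel]), sent)
              for sent, embs in corpus.items()]
    return max(scored)[1]

# A query vector near the "anger" direction matches via the implicit
# channel, even though both sentences describe a similar literal action.
angry_query = [0.1, 0.9, 0.2]
best = retrieve(angry_query, "implicit")
```

The point of the sketch is the selection step: the same corpus answers a surface-matching query through the explicit channel and an inference-oriented query through the implicit channel, with no re-encoding.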
Problem

Research questions and friction points this paper is trying to address.

Capturing implicit semantics in sentence embeddings
Overcoming single-vector limitations in representation
Encoding both explicit and implicit semantic meanings
Innovation

Methods, ideas, or system contributions that make the work stand out.

DualCSE assigns two embeddings per sentence
Separates explicit and implicit semantic representations
Shared space enables selective semantic usage
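The summary describes a contrastive learning framework over the two channels. A standard InfoNCE-style objective of the kind typically used for this (a generic sketch, not necessarily DualCSE's exact loss) looks like:

```python
import math

def info_nce(anchor, positive, negatives, temperature=0.05):
    """Generic InfoNCE loss over plain-list embeddings:
    pull the positive toward the anchor, push negatives away."""
    def cos(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        return dot / (math.sqrt(sum(a * a for a in u)) *
                      math.sqrt(sum(b * b for b in v)))
    # Positive pair first, then the in-batch negatives.
    logits = [cos(anchor, positive)] + [cos(anchor, n) for n in negatives]
    logits = [s / temperature for s in logits]
    # Numerically stable log-softmax of the positive's logit.
    m = max(logits)
    denom = sum(math.exp(s - m) for s in logits)
    return -(logits[0] - m - math.log(denom))

# An anchor close to its positive and far from negatives -> near-zero loss.
loss = info_nce([1.0, 0.0], [0.99, 0.1], [[0.0, 1.0], [-1.0, 0.0]])
```

In a dual-channel setup, one would apply such a loss per channel with channel-appropriate positives (e.g. paraphrases for the explicit channel, entailed sentences for the implicit one), which is how the two embeddings of a sentence can stay disentangled while sharing one space.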
Kohei Oda
Japan Advanced Institute of Science and Technology
Po-Min Chuang
Toshiba Corporation
Kiyoaki Shirai
Japan Advanced Institute of Science and Technology
Natthawut Kertkeidkachorn
Japan Advanced Institute of Science and Technology
Knowledge Graph · Natural Language Processing · Machine Learning