CASE: Condition-Aware Sentence Embeddings for Conditional Semantic Textual Similarity Measurement

📅 2025-03-21
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Existing sentence embedding methods for Conditional Semantic Textual Similarity (C-STS) lack contextual sensitivity; that is, they fail to adequately adapt sentence representations to the given condition. Method: We propose a condition-aware sentence embedding framework featuring (i) bidirectional information flow, where sentence-level features are explicitly injected into the attention-based pooling of condition encodings to enable fine-grained condition–sentence interaction; and (ii) supervised nonlinear dimensionality reduction (an MLP-based projection) that jointly compresses embeddings and enhances semantic discriminability. The approach integrates LLM-derived condition encoding, attention-guided interactive pooling, condition-embedding subtraction, and a learnable dimensionality reduction. Contribution/Results: The method achieves state-of-the-art performance on the standard C-STS benchmark. Ablation studies confirm that the proposed dimensionality reduction not only improves computational efficiency but also improves the similarity modeling capability of LLM embeddings, demonstrating joint gains in both accuracy and efficiency.

📝 Abstract
The meaning conveyed by a sentence often depends on the context in which it appears. Despite the progress of sentence embedding methods, it remains unclear how to best modify a sentence embedding conditioned on its context. To address this problem, we propose Condition-Aware Sentence Embeddings (CASE), an efficient and accurate method to create an embedding for a sentence under a given condition. First, CASE creates an embedding for the condition using a Large Language Model (LLM), where the sentence influences the attention scores computed for the tokens in the condition during pooling. Next, a supervised nonlinear projection is learned to reduce the dimensionality of the LLM-based text embeddings. We show that CASE significantly outperforms previously proposed Conditional Semantic Textual Similarity (C-STS) methods on an existing standard benchmark dataset. We find that subtracting the condition embedding consistently improves the C-STS performance of LLM-based text embeddings. Moreover, we propose a supervised dimensionality reduction method that not only reduces the dimensionality of LLM-based embeddings but also significantly improves their performance.
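The abstract describes the pooling step only at a high level. A minimal sketch of what sentence-guided attention pooling over condition token embeddings might look like; the function names, the scaled dot-product scoring, and the toy dimensions are assumptions for illustration, not the authors' implementation:

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D score vector."""
    e = np.exp(x - x.max())
    return e / e.sum()

def sentence_guided_pooling(cond_token_embs, sent_emb):
    """Pool condition token embeddings into a single condition
    embedding, with attention weights computed against the
    sentence embedding (illustrative sketch only)."""
    # Score each condition token against the sentence embedding.
    scores = cond_token_embs @ sent_emb              # (num_tokens,)
    # Scaled softmax, as in standard dot-product attention.
    weights = softmax(scores / np.sqrt(len(sent_emb)))
    # Attention-weighted sum of the condition token embeddings.
    return weights @ cond_token_embs                 # (dim,)

# Toy example: 4 condition tokens, 8-dimensional embeddings.
rng = np.random.default_rng(0)
tokens = rng.normal(size=(4, 8))
sentence = rng.normal(size=8)
pooled = sentence_guided_pooling(tokens, sentence)
print(pooled.shape)  # (8,)
```

In the paper's framing, letting the sentence set the attention weights is what makes the condition embedding sentence-aware rather than a static encoding of the condition text alone.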
Problem

Research questions and friction points this paper is trying to address.

Modify sentence embeddings based on contextual conditions
Improve Conditional Semantic Textual Similarity (C-STS) measurement
Optimize dimensionality reduction for LLM-based embeddings
Innovation

Methods, ideas, or system contributions that make the work stand out.

Condition-aware sentence embeddings using LLM
Supervised nonlinear projection for dimensionality reduction
Subtracting condition embedding improves performance
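The last two points above can be sketched together: subtract the condition embedding from the sentence embedding, then apply a learned nonlinear projection to a lower dimensionality. The two-layer MLP shape, the tanh nonlinearity, and all names are assumptions; in the paper the projection parameters would be trained with C-STS supervision rather than drawn at random:

```python
import numpy as np

def mlp_project(x, W1, b1, W2, b2):
    """Supervised nonlinear projection (forward pass only): a small
    MLP mapping a high-dimensional LLM embedding to a lower one."""
    h = np.tanh(x @ W1 + b1)   # nonlinear hidden layer
    return h @ W2 + b2         # linear output layer

def condition_aware_embedding(sent_emb, cond_emb, params):
    """Subtract the condition embedding, then project (sketch)."""
    return mlp_project(sent_emb - cond_emb, *params)

# Toy example: reduce a 16-dim embedding to 4 dims.
rng = np.random.default_rng(1)
d_in, d_hid, d_out = 16, 8, 4
params = (rng.normal(size=(d_in, d_hid)), np.zeros(d_hid),
          rng.normal(size=(d_hid, d_out)), np.zeros(d_out))
sent = rng.normal(size=d_in)
cond = rng.normal(size=d_in)
emb = condition_aware_embedding(sent, cond, params)
print(emb.shape)  # (4,)
```

Because the projection is trained rather than heuristic (unlike, say, PCA), the abstract reports it both shrinks the embeddings and improves C-STS performance.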
Gaifan Zhang, Columbia University (Natural Language Processing)
Yi Zhou, Cardiff University
D. Bollegala, University of Liverpool