Resource-Efficient Adaptation of Large Language Models for Text Embeddings via Prompt Engineering and Contrastive Fine-tuning

📅 2025-07-30
📈 Citations: 0
Influential: 0
🤖 AI Summary
To obtain high-quality, controllable sentence-level embeddings from large language models (LLMs) for non-generative tasks such as clustering, classification, and retrieval, this paper proposes a unified framework that integrates prompt engineering, contrastive fine-tuning, and semantics-aware aggregation. Specifically, it introduces task-oriented prompt templates, synthesizes positive pairs to drive contrastive learning, and aggregates token-level vectors into sentence embeddings via attention-based weighting, preserving salient semantics while suppressing noise. The method requires only lightweight fine-tuning of decoder-only LLMs, and an attention analysis confirms the enhanced semantic focus. Evaluated on the MTEB English clustering benchmark, it achieves state-of-the-art performance, clearly outperforming mainstream embedding models. The results demonstrate the framework's effectiveness, robustness, and practical utility for sentence-embedding generation in non-generative downstream applications.

📝 Abstract
Large Language Models (LLMs) have become a cornerstone in Natural Language Processing (NLP), achieving impressive performance in text generation. Their token-level representations capture rich, human-aligned semantics. However, pooling these vectors into a text embedding discards crucial information. Nevertheless, many non-generative downstream tasks, such as clustering, classification, or retrieval, still depend on accurate and controllable sentence- or document-level embeddings. We explore several adaptation strategies for pre-trained, decoder-only LLMs: (i) various aggregation techniques for token embeddings, (ii) task-specific prompt engineering, and (iii) text-level augmentation via contrastive fine-tuning. Combining these components yields state-of-the-art performance on the English clustering track of the Massive Text Embedding Benchmark (MTEB). An analysis of the attention map further shows that fine-tuning shifts focus from prompt tokens to semantically relevant words, indicating more effective compression of meaning into the final hidden state. Our experiments demonstrate that LLMs can be effectively adapted as text embedding models through a combination of prompt engineering and resource-efficient contrastive fine-tuning on synthetically generated positive pairs.
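The aggregation idea described above (pooling token-level vectors into one sentence embedding, with semantically relevant tokens weighted more heavily) can be sketched as follows. This is an illustrative toy, not the paper's exact method: the function name `weighted_pool` and the attention scores are made up for the example.

```python
import numpy as np

def weighted_pool(token_embs: np.ndarray, weights: np.ndarray) -> np.ndarray:
    """Aggregate token embeddings into a single sentence vector.

    token_embs: (T, d) final-layer hidden states for T tokens.
    weights:    (T,) non-negative relevance scores, e.g. attention mass
                received by each token; renormalised to a distribution.
    """
    w = weights / weights.sum()   # normalise so the weights sum to 1
    return w @ token_embs         # convex combination of token vectors

# Toy example: 3 tokens with 4-dim one-hot embeddings, so the pooled
# vector directly exposes how much each token contributes.
H = np.array([[1., 0., 0., 0.],
              [0., 1., 0., 0.],
              [0., 0., 1., 0.]])
attn = np.array([0.2, 0.2, 0.6])  # hypothetical attention scores
emb = weighted_pool(H, attn)      # dominated by the most-attended token
```

Compared with plain mean pooling (uniform weights), this lets the model suppress prompt or filler tokens and concentrate the embedding on content words, which matches the paper's observation that fine-tuning shifts attention toward semantically relevant words.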
Problem

Research questions and friction points this paper is trying to address.

Improving text embeddings from token-level LLM representations
Adapting LLMs for non-generative tasks via efficient methods
Enhancing semantic compression in embeddings via contrastive fine-tuning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Token embedding aggregation techniques
Task-specific prompt engineering
Contrastive fine-tuning on synthetic pairs
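The contrastive fine-tuning objective over synthetic positive pairs is typically an in-batch InfoNCE loss; the sketch below shows that loss in numpy under assumed details (temperature value, L2 normalisation, in-batch negatives) that the card does not specify.

```python
import numpy as np

def info_nce(anchors: np.ndarray, positives: np.ndarray,
             temperature: float = 0.05) -> float:
    """In-batch InfoNCE loss over L2-normalised embeddings.

    anchors, positives: (B, d) arrays; row i of `positives` is a
    synthetically generated positive (e.g. a paraphrase) of row i of
    `anchors`. All other rows in the batch act as negatives.
    """
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = a @ p.T / temperature  # (B, B) scaled cosine similarities
    # Row-wise log-softmax; the diagonal entries are the positive pairs.
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))

# Sanity check: identical pairs are easy positives, so their loss should
# be lower than for mismatched (shuffled) pairs.
rng = np.random.default_rng(0)
A = rng.normal(size=(4, 8))
loss_matched = info_nce(A, A)
loss_shuffled = info_nce(A, np.roll(A, 1, axis=0))
```

Minimising this loss pulls each anchor toward its synthetic positive and pushes it away from the rest of the batch, which is what compresses task-relevant meaning into the pooled embedding.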
Benedikt Roth
fortiss GmbH, Munich, Germany
Stephan Rappensperger
fortiss GmbH, Munich, Germany; School of Computation and Information Technology, Technical University of Munich, Germany
Tianming Qiu
fortiss / Technical University of Munich
Hamza Imamović
fortiss GmbH, Munich, Germany; School of Computation and Information Technology, Technical University of Munich, Germany
Julian Wörmann
fortiss GmbH, Munich, Germany; School of Computation and Information Technology, Technical University of Munich, Germany
Hao Shen
fortiss GmbH, Munich, Germany; School of Computation and Information Technology, Technical University of Munich, Germany