🤖 AI Summary
This work identifies a systemic deficiency in mainstream text embedding models: their inability to reliably encode numerical information, despite widespread deployment in numerically sensitive domains such as finance and healthcare. Specifically, these models frequently fail to distinguish semantically critical yet lexically similar numeric expressions (e.g., “increased by 2%” vs. “increased by 20%”). To rigorously assess this limitation, the authors introduce the first evaluation framework explicitly designed for numerical sensitivity, built upon controllably synthesized financial texts. They conduct similarity analysis and controlled-variable experiments across 13 state-of-the-art embedding models. Results consistently reveal a pronounced “numerical gap”: embedding distances exhibit weak or no correlation with underlying numeric magnitudes. This study provides the first systematic quantification of embedding models’ numerical awareness deficits, establishing both theoretical insight and empirical evidence to guide the development of more reliable NLP systems for numeric-intensive applications.
📝 Abstract
Text embedding models are widely used in natural language processing applications. However, their capability is often benchmarked on tasks that do not require understanding nuanced numerical information in text. As a result, it remains unclear whether current embedding models can precisely encode numerical content, such as numbers, into embeddings. This question is critical because embedding models are increasingly applied in domains where numbers matter, such as finance and healthcare. For example, “Company X's market share grew by 2%” should be interpreted very differently from “Company X's market share grew by 20%,” even though both indicate growth in market share. This study aims to examine whether text embedding models can capture such nuances. Using synthetic data in a financial context, we evaluate 13 widely used text embedding models and find that they generally struggle to capture numerical details accurately. Our further analyses provide deeper insights into embedding numeracy, informing future research on strengthening embedding-based NLP systems' handling of numerical content.
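The core probe described above — comparing similarity scores for sentences that differ only in a numeric value — can be sketched in a few lines. The snippet below is a minimal illustration, not the paper's actual pipeline: the `char_ngrams` function is a deliberately crude surface-form stand-in for a real embedding model (in practice one would substitute a neural encoder, e.g. via a library such as sentence-transformers), and the sentence template and percentages are hypothetical. It shows why lexically near-identical sentences like the 2% vs. 20% pair tend to receive high similarity regardless of how far apart their numeric magnitudes are.

```python
import math
from collections import Counter

def char_ngrams(text: str, n: int = 3) -> Counter:
    """Bag of character n-grams: a crude surface-form 'embedding' stand-in.

    A real experiment would replace this with a neural embedding model;
    the point here is that sentences differing only in one numeral are
    nearly identical at the surface level.
    """
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb)

# Hypothetical probe template: vary only the numeric magnitude.
base = "Company X's market share grew by 2%."
variants = {p: f"Company X's market share grew by {p}%." for p in (3, 20, 200)}

for p, sent in variants.items():
    sim = cosine(char_ngrams(base), char_ngrams(sent))
    print(f"{p:>4}% -> similarity to base: {sim:.3f}")
```

All three variants score nearly the same similarity to the base sentence, even though 200% is two orders of magnitude away from 2% while 3% is adjacent — the "numerical gap" pattern the study reports for neural embedding models, reproduced here with a purely lexical proxy.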