A Comprehensive Taxonomy of Negation for NLP and Neural Retrievers

📅 2025-07-29

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work examines why neural information retrieval (IR) models and large language models (LLMs) underperform on queries that contain negation. The authors introduce a negation taxonomy grounded in philosophy, linguistics, and formal logic, and use it to generate two benchmark datasets for evaluating and fine-tuning neural IR models. They also propose a logic-based classification mechanism that reveals which negation types existing IR datasets cover. Because the taxonomy yields a balanced distribution over negation types, it provides a better training setup that converges faster on the NevIR dataset. The classification schema additionally offers insight into the factors that may affect how well fine-tuned models generalize on negation.

📝 Abstract

Understanding and solving complex reasoning tasks is vital for addressing the information needs of a user. Although dense neural models learn contextualised embeddings, they still underperform on queries containing negation. To understand this phenomenon, we study negation in both traditional neural information retrieval and LLM-based models. We (1) introduce a taxonomy of negation that derives from philosophical, linguistic, and logical definitions; (2) generate two benchmark datasets that can be used to evaluate the performance of neural information retrieval models and to fine-tune models for more robust performance on negation; and (3) propose a logic-based classification mechanism that can be used to analyze the performance of retrieval models on existing datasets. Our taxonomy produces a balanced data distribution over negation types, providing a better training setup that leads to faster convergence on the NevIR dataset. Moreover, we propose a classification schema that reveals the coverage of negation types in existing datasets, offering insights into the factors that might affect the generalization of fine-tuned models on negation.

Problem

Research questions and friction points this paper is trying to address.

Studying negation in neural retrieval and LLM-based models

Creating a taxonomy and benchmarks for negation evaluation

Improving model performance on queries with negation

Innovation

Methods, ideas, or system contributions that make the work stand out.

Introduces a negation taxonomy from multiple disciplines

Generates benchmark datasets for negation evaluation

Proposes logic-based classification for retrieval analysis
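
To make the idea of measuring negation-type coverage concrete, here is a minimal Python sketch. It is not the paper's classification mechanism: the three cue categories (`sentential`, `exclusion`, `affixal`), the tiny regex lexicons, and the function names `classify_query` and `coverage` are all assumptions chosen for illustration. It only shows how labelling each query and counting the labels can expose an unbalanced distribution in a dataset.

```python
import re
from collections import Counter

# Hypothetical surface-cue categories; NOT the taxonomy from the paper.
CUE_PATTERNS = {
    # Explicit negators such as "not", "no", "never", and contracted "n't".
    "sentential": re.compile(r"\b(?:not|no|never|cannot)\b|n't\b", re.IGNORECASE),
    # Exclusion markers such as "without" or "except".
    "exclusion": re.compile(r"\b(?:without|except|excluding)\b", re.IGNORECASE),
    # A deliberately tiny lexicon of negative affixes (illustrative only).
    "affixal": re.compile(r"\b(?:non-\w+|unrelated|unsafe|unknown)\b", re.IGNORECASE),
}


def classify_query(query):
    """Return the set of cue categories found in the query, or {"none"}."""
    labels = {name for name, pattern in CUE_PATTERNS.items() if pattern.search(query)}
    return labels or {"none"}


def coverage(queries):
    """Return the fraction of queries that carry each cue category."""
    if not queries:
        return {}
    counts = Counter()
    for query in queries:
        counts.update(classify_query(query))
    return {label: counts[label] / len(queries) for label in sorted(counts)}


if __name__ == "__main__":
    sample = [
        "films that are not set in Paris",
        "recipes without nuts",
        "non-fiction books about space",
        "history of the Roman Empire",
    ]
    for label, share in coverage(sample).items():
        print(f"{label}: {share:.2f}")
```

A query can carry more than one label, so the shares need not sum to one. A real analysis would replace the regex lexicons with the taxonomy's own definitions and a more careful classifier.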
