🤖 AI Summary
This work addresses a limitation of traditional retrieval systems: they treat semantic similarity as relevance, so for queries with antonymy, negation, or exclusion constraints they often return documents that are topically close but violate the constraint. To address this, the authors propose CoDeR, a dense retrieval method that explicitly separates topical relevance from constraint compatibility. CoDeR keeps a standard dense encoder for topic-based recall and adds a bi-encoder compatibility scorer trained contrastively with lexical-polarity supervision. The framework produces constraint-aware rankings without calling a large language model at inference time. On three controlled diagnostic datasets, CoDeR reduces V@2 by 20.59, 23.53, and 5.77 points respectively, relative to the strongest non-CoDeR baselines, and pushes the first violating document deeper in the ranking.
📝 Abstract
Information retrieval systems have long treated semantic similarity as a proxy for relevance. For constraint-sensitive queries, this proxy can fail when a document is topically close to the query but supports the opposite constraint direction, such as satisfying an attribute that should be excluded or affirming a relation that should be negated. We study this failure as constraint-violating evidence exposure and propose CoDeR, a local constraint-compatible dense retrieval method that separates topical relevance from constraint compatibility. CoDeR keeps a standard topical encoder for candidate coverage and adds a compatibility scorer, implemented as a bi-encoder, trained with lexical-polarity supervision over contrastive satisfying and violating evidence. The compatibility signal can be used to rescore topical candidates or to retrieve an auxiliary compatibility-oriented candidate set, producing a ranked document list without external Large Language Model (LLM) calls at inference time. We evaluate CoDeR on controlled diagnostics and public negative-constraint retrieval benchmarks. Across three controlled diagnostic sets targeting antonymy, negation, and exclusion, CoDeR reduces V@2 by 20.59, 23.53, and 5.77 points relative to the strongest non-CoDeR baselines, and improves FVR by pushing the first violating document deeper in the ranking.
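The rescoring use of the compatibility signal can be illustrated with a minimal sketch. This is not the authors' implementation. The linear blend, the weight `alpha`, the toy scores, and all function names are hypothetical. The abstract does not define V@k or FVR, so the sketch assumes a per-query V@k that flags whether any violating document appears in the top k, and reads FVR as the 1-based rank of the first violating document.

```python
def rescore(topical, compat, alpha=0.5):
    """Blend topical and compatibility scores per candidate.

    A linear blend with weight `alpha` is an assumption made for
    illustration; the abstract does not say how the signals are combined.
    """
    return [(1.0 - alpha) * t + alpha * c for t, c in zip(topical, compat)]


def rank(scores):
    """Return candidate indices ordered by descending score (ties keep input order)."""
    return sorted(range(len(scores)), key=lambda i: -scores[i])


def first_violation_rank(ranking, violating):
    """1-based position of the first violating document, or None if none is ranked."""
    for position, doc in enumerate(ranking, start=1):
        if doc in violating:
            return position
    return None


def violation_at_k(ranking, violating, k):
    """True if any violating document appears in the top k (assumed per-query V@k)."""
    return any(doc in violating for doc in ranking[:k])


# Toy candidate pool: document 1 is topically close but violates the constraint.
topical = [0.90, 0.88, 0.70, 0.65]   # hypothetical topical-encoder similarities
compat = [0.90, -0.80, 0.60, 0.50]   # hypothetical compatibility-scorer outputs
violating = {1}

topical_ranking = rank(topical)                    # [0, 1, 2, 3]
rescored_ranking = rank(rescore(topical, compat))  # [0, 2, 3, 1]

print(first_violation_rank(topical_ranking, violating))   # 2
print(first_violation_rank(rescored_ranking, violating))  # 4
print(violation_at_k(rescored_ranking, violating, 2))     # False
```

In this toy case the violating document sits at rank 2 under topical similarity alone and moves to rank 4 after rescoring, which is the kind of effect the V@2 and FVR figures are described as measuring.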