🤖 AI Summary
Traditional vector retrieval lacks explicit logical reasoning, so it struggles with complex queries that involve logical operators (e.g., NOT, AND, OR). This paper introduces LogicIR, presented as the first inference-time logical reasoning framework that explicitly parses the logical structure of natural language queries during retrieval. LogicIR uses a rule-driven compilation step to turn each query into an executable logical expression, then combines per-term cosine similarity scores through weighted combination functions tailored to AND, OR, and NOT. The result is end-to-end logic-aware retrieval that requires neither fine-tuning nor re-ranking, with logical structure extraction and dynamic score composition both performed inside the retrieval pipeline itself. On both synthetic and real-world benchmarks, it significantly improves mean average precision (mAP) on complex queries while maintaining millisecond-scale latency.
📝 Abstract
Traditional retrieval methods rely on transforming user queries into vector representations and retrieving documents based on cosine similarity within an embedding space. While efficient and scalable, this approach often fails to handle complex queries involving logical constructs such as negations, conjunctions, and disjunctions. In this paper, we propose a novel inference-time logical reasoning framework that explicitly incorporates logical reasoning into the retrieval process. Our method extracts logical reasoning structures from natural language queries and then composes the individual cosine similarity scores to formulate the final document scores. This approach enables the retrieval process to handle complex logical reasoning without compromising computational efficiency. Our results on both synthetic and real-world benchmarks demonstrate that the proposed method consistently outperforms traditional retrieval methods across different models and datasets, significantly improving retrieval performance for complex queries.
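To make the score-composition idea concrete, here is a minimal Python sketch of how per-term cosine similarities might be combined along a parsed logical expression. The abstract does not give the exact combination functions, so the operators below are assumptions borrowed from fuzzy logic (product for AND, max for OR, `1 - s` for NOT), cosine values are clamped to [0, 1] for convenience, and the query is already parsed by hand. The names `Term`, `Not`, `And`, `Or`, and `score`, as well as the toy 2-D embeddings, are hypothetical and not taken from the paper.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Union

Vector = List[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical expression tree for an already-parsed query.
@dataclass(frozen=True)
class Term:
    text: str


@dataclass(frozen=True)
class Not:
    child: "Expr"


@dataclass(frozen=True)
class And:
    left: "Expr"
    right: "Expr"


@dataclass(frozen=True)
class Or:
    left: "Expr"
    right: "Expr"


Expr = Union[Term, Not, And, Or]


def score(expr: Expr, doc: Vector, term_vecs: Dict[str, Vector]) -> float:
    """Compose per-term similarities into one document score.

    Assumed operators (not confirmed by the source):
    AND -> product, OR -> max, NOT -> 1 - s.
    """
    if isinstance(expr, Term):
        # Clamp to [0, 1] so that NOT and AND behave sensibly.
        return max(0.0, cosine(term_vecs[expr.text], doc))
    if isinstance(expr, Not):
        return 1.0 - score(expr.child, doc, term_vecs)
    if isinstance(expr, And):
        return score(expr.left, doc, term_vecs) * score(expr.right, doc, term_vecs)
    if isinstance(expr, Or):
        return max(score(expr.left, doc, term_vecs), score(expr.right, doc, term_vecs))
    raise TypeError(f"unsupported expression node: {expr!r}")


# Toy 2-D embeddings standing in for a real encoder (illustrative only).
term_vecs = {"cats": [1.0, 0.0], "dogs": [0.0, 1.0]}
docs = {
    "d_cats": [1.0, 0.0],  # about cats only
    "d_dogs": [0.0, 1.0],  # about dogs only
    "d_both": [1.0, 1.0],  # about both
}

# Query: "cats AND NOT dogs"
query = And(Term("cats"), Not(Term("dogs")))

scores = {name: score(query, vec, term_vecs) for name, vec in docs.items()}
ranking = sorted(docs, key=lambda name: scores[name], reverse=True)
print(ranking)
```

With these toy vectors, the document about cats only ranks first, the mixed document second, and the dogs-only document last. A single-vector query embedding of "cats but not dogs" has no mechanism to produce that ordering, since it tends to sit close to anything that mentions dogs. Because each term similarity is an ordinary cosine score, the composition adds only a few arithmetic operations per document, which is in line with the abstract's claim that efficiency is preserved.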