🤖 AI Summary
This work addresses the limited factuality and interpretability of existing retrieval-augmented generation (RAG) systems, whose retrievers often rely solely on topical similarity. The authors propose CERA (Contrastive Evidence Rationale Attention), a framework that combines subjectivity-based hard negative selection with an auxiliary CLS-to-token attention alignment loss, which injects an inductive bias toward evidence grounding. CERA fine-tunes a dense retriever with two objectives: triplet-based contrastive learning, and attention alignment supervised by a part-of-speech-weighted distribution over human-annotated factual rationales. This steers the model to attend to specific evidential spans. On a large corpus of clinical trial reports, subjectivity-based hard negative selection substantially improves retrieval effectiveness over both Contriever and hard negative selection baselines, while rationale alignment improves faithfulness and keeps retrieval performance competitive, yielding more interpretable evidence selection.
📝 Abstract
Ensuring factuality and interpretability in RAG remains an open and urgent problem. We introduce Contrastive Evidence Rationale Attention (CERA), the first retrieval framework to employ subjectivity-based hard negative selection and inject an evidential inductive bias into contrastive learning through an auxiliary attention alignment loss. CERA fine-tunes a dense retriever using two training objectives: triplet-based contrastive learning and interpretable attention alignment, which supervises CLS-to-token attention using a part-of-speech-weighted masking distribution over human-annotated factual rationales as evidence signals. Experiments on a large corpus of clinical trial reports demonstrate that the subjectivity-based hard negative selection substantially improves retrieval effectiveness compared to both Contriever and hard negative selection baselines. Furthermore, rationale alignment improves faithfulness while maintaining competitive retrieval performance, supporting the hypothesis that attention can serve as a more faithful explanation of model behavior when guided by human rationales. Moving beyond topical similarity, CERA enables the retriever to identify the specific tokens that constitute supporting evidence, promoting more interpretable evidence selection in RAG systems.
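The two training objectives described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact loss formulas, so everything specific here is an assumption made for illustration: the hinge form of the triplet loss over dot-product similarity, the KL divergence used for attention alignment, the part-of-speech weight table `POS_WEIGHTS`, the mixing coefficient `lam`, and all function names. It is not the authors' implementation.

```python
import math

# Hypothetical part-of-speech weights (assumption): content-bearing tags
# get more mass than function words inside an annotated rationale.
POS_WEIGHTS = {"NOUN": 1.0, "NUM": 1.0, "VERB": 0.7, "ADJ": 0.5}
DEFAULT_POS_WEIGHT = 0.1


def dot(a, b):
    """Dot-product similarity between two embedding vectors."""
    return sum(x * y for x, y in zip(a, b))


def triplet_loss(query, positive, negative, margin=1.0):
    """Hinge triplet loss: the positive should beat the hard negative by `margin`."""
    return max(0.0, margin - dot(query, positive) + dot(query, negative))


def rationale_target(pos_tags, rationale_mask):
    """Turn a 0/1 rationale mask into a POS-weighted target distribution over tokens."""
    weights = [
        POS_WEIGHTS.get(tag, DEFAULT_POS_WEIGHT) * mask
        for tag, mask in zip(pos_tags, rationale_mask)
    ]
    total = sum(weights)
    if total == 0:
        # No annotated rationale: fall back to a uniform target.
        return [1.0 / len(weights)] * len(weights)
    return [w / total for w in weights]


def attention_alignment_loss(cls_attention, target, eps=1e-8):
    """KL(target || attention) between the rationale target and CLS-to-token attention."""
    return sum(
        t * math.log((t + eps) / (a + eps))
        for t, a in zip(target, cls_attention)
        if t > 0
    )


def combined_loss(query, positive, negative, cls_attention, target, lam=0.5):
    """Weighted sum of the contrastive and alignment objectives (lam is an assumption)."""
    return triplet_loss(query, positive, negative) + lam * attention_alignment_loss(
        cls_attention, target
    )


# Toy example: three tokens, the first and third marked as rationale.
target = rationale_target(["NOUN", "DET", "NUM"], [1, 0, 1])
loss = combined_loss([1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.4, 0.2, 0.4], target)
```

In this sketch the alignment term is zero only when the CLS-to-token attention already matches the rationale distribution, so minimizing it pushes attention mass onto the annotated evidence tokens, while the triplet term separately controls ranking against the hard negative.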