🤖 AI Summary
This study investigates the impact of contextual information on Discourse Relation Classification (DRC) in scientific texts, aiming to enhance the evidence traceability and interpretability of AI-generated scientific claims. Addressing the limitation of prior DRC research—its neglect of domain-specific scientific context—we systematically analyze the differential contextual dependency across discourse relation types for the first time and propose a discourse-structure-aware context modeling strategy. We evaluate both pretrained language models and large language models on scientific publication data. Experimental results demonstrate that incorporating context consistently improves classification performance (average +3.2% F1), with significantly higher dependency observed for logically complex relations such as causality and contrast. Our work provides theoretical grounding and methodological foundations for scientific literature understanding, evidence chain construction, and improving the reasoning interpretability of generative AI systems.
📝 Abstract
With the increasing use of generative Artificial Intelligence (AI) methods to support science workflows, we are interested in the use of discourse-level information to find supporting evidence for AI generated scientific claims. A first step towards this objective is to examine the task of inferring discourse structure in scientific writing.
In this work, we present a preliminary investigation of pretrained language model (PLM) and Large Language Model (LLM) approaches for Discourse Relation Classification (DRC), focusing on scientific publications, an under-studied genre for this task. We examine how context can help with the DRC task, with our experiments showing that context, as defined by discourse structure, is generally helpful. We also present an analysis of which scientific discourse relation types might benefit most from context.