🤖 AI Summary
Existing document-grounded dialogue systems usually split documents into independent passages and ignore the documents' internal structure, which leaves retrieval and generation without enough surrounding context. The authors propose SF-Re2G, a framework that uses hierarchical document structure at every stage of the pipeline: retrieval, reranking, and generation. During retrieval, structure-aware contrastive learning improves each passage representation by contrasting it with other passages from the same section. During reranking, retrieved candidates are grouped into subgraphs based on structural proximity, and each candidate is rescored using context aggregated from its neighboring passages. Finally, the subgraph context guides answer generation. Experiments on two document-grounded dialogue datasets covering Chinese and English report improvements in both retrieval and generation, supporting the value of structural information across languages.
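The retrieval step can be illustrated with a minimal sketch. It assumes one possible reading of "structure-aware contrastive learning": an InfoNCE-style loss in which other passages from the same section act as hard negatives for the grounding passage. The function name `section_contrastive_loss`, the dot-product similarity, and the `temperature` value are illustrative assumptions, not details taken from the paper.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def section_contrastive_loss(query, positive, same_section_negatives, temperature=0.1):
    """InfoNCE-style loss (hypothetical sketch, not the paper's exact objective).

    query: embedding of the dialogue query
    positive: embedding of the grounding passage
    same_section_negatives: embeddings of other passages from the same section,
        treated here as hard negatives (an assumption)
    """
    logits = [dot(query, positive) / temperature]
    logits += [dot(query, neg) / temperature for neg in same_section_negatives]
    # Numerically stable log-sum-exp over the positive and all negatives.
    top = max(logits)
    log_denominator = top + math.log(sum(math.exp(x - top) for x in logits))
    # Negative log-probability of the positive passage.
    return log_denominator - logits[0]


if __name__ == "__main__":
    query = [1.0, 0.0]
    aligned = section_contrastive_loss(query, [1.0, 0.0], [[0.0, 1.0]])
    misaligned = section_contrastive_loss(query, [0.0, 1.0], [[1.0, 0.0]])
    print(aligned < misaligned)
```

The loss is small when the query is closer to the grounding passage than to its section neighbors, and large when a neighbor scores higher, which is the pressure that would separate passages sharing a section.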
📝 Abstract
Document-grounded dialogue systems (DGDS) utilize knowledge from external documents to answer domain-specific user questions. Existing solutions typically divide documents into independent passages for retrieval and response generation. This approach, however, neither makes good use of the structural information within documents nor provides enough document context for knowledge selection and response generation. This paper proposes SF-Re2G to address these issues systematically. First, we improve passage representations by contrasting each passage with others from the same section, thereby improving retrieval performance. Second, we build a structure-enhanced reranker that exploits the fact that the multiple grounding passages of one dialogue turn tend to lie in the same neighborhood. Specifically, candidates from the retrieval stage are grouped into subgraphs according to the document structure, and the reranker rescores each candidate using information from its group. Finally, the chosen passages are used for response generation, with the subgraph context taken into account for better generation. Experimental results on two DGDS datasets validate our method for both Chinese and English.
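The grouping-and-rescoring idea behind the reranker can be sketched as follows. This is a toy illustration under stated assumptions, not the paper's model: a shared `section_id` stands in for the subgraph structure, the group signal is the mean retrieval score of the other candidates in the same section, and `alpha` is a hypothetical mixing weight.

```python
from collections import defaultdict


def rerank_with_section_context(candidates, alpha=0.5):
    """Rescore candidates using their section neighbors (hypothetical sketch).

    candidates: list of (passage_id, section_id, retrieval_score) tuples
    alpha: assumed weight of the neighbor signal
    Returns a list of (passage_id, new_score), sorted best first.
    """
    # Group retrieval scores by section; a section stands in for a subgraph.
    groups = defaultdict(list)
    for _, section_id, score in candidates:
        groups[section_id].append(score)

    rescored = []
    for passage_id, section_id, score in candidates:
        scores = groups[section_id]
        if len(scores) > 1:
            # Mean score of the other candidates in the same section.
            neighbor_mean = (sum(scores) - score) / (len(scores) - 1)
        else:
            # A lone candidate receives no support from neighbors.
            neighbor_mean = 0.0
        rescored.append((passage_id, score + alpha * neighbor_mean))

    return sorted(rescored, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    retrieved = [("p1", "s1", 0.9), ("p2", "s2", 0.8), ("p3", "s2", 0.7)]
    for passage_id, score in rerank_with_section_context(retrieved):
        print(passage_id, round(score, 3))
```

In this example the two passages from section `s2` support each other and overtake the isolated passage `p1`, mirroring the observation that grounding passages for one turn tend to lie in the same neighborhood. A learned reranker would replace the fixed mean with a trained aggregation over the subgraph.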