🤖 AI Summary
This work addresses a limitation of large language model summaries of user-generated content: "presence bias," the tendency to reflect what is explicitly mentioned while overlooking what is missing, which can mislead users who rely on summaries to make decisions. To mitigate this, the authors propose DiSCo (Domain Informed Summarization through Contrast), an expectation-based method that compares each entity's content with reference distributions of the aspects typically discussed for comparable accommodations. The comparison identifies aspects that are unusually emphasized or missing relative to domain norms and integrates them into the generated summary. In a user study across three accommodation domains (ski, beach, and city center), DiSCo summaries were rated as more detailed and more useful for decision making than baseline large language model summaries, although slightly harder to read.
📝 Abstract
Intelligent interfaces increasingly use large language models to summarize user-generated content, yet these summaries emphasize what is mentioned while overlooking what is missing. This presence bias can mislead users who rely on summaries to make decisions. We present Domain Informed Summarization through Contrast (DiSCo), an expectation-based computational approach that makes absences visible by comparing each entity's content with domain topical expectations captured in reference distributions of aspects typically discussed in comparable accommodations. This comparison identifies aspects that are either unusually emphasized or missing relative to domain norms and integrates them into the generated text. In a user study across three accommodation domains, namely ski, beach, and city center, DiSCo summaries were rated as more detailed and useful for decision making than baseline large language model summaries, although slightly harder to read. The findings show that modeling expectations reduces presence bias and improves both transparency and decision support in intelligent summarization interfaces.
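The contrast step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how deviations from domain norms are scored, so the ratio threshold, the function names (`aspect_shares`, `contrast_with_domain`), and the toy numbers below are all assumptions made for illustration.

```python
from collections import Counter
from typing import Dict, List, Tuple


def aspect_shares(mentions: List[str]) -> Dict[str, float]:
    """Turn a list of aspect mentions into a normalized share per aspect."""
    counts = Counter(mentions)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {aspect: n / total for aspect, n in counts.items()}


def contrast_with_domain(
    entity_mentions: List[str],
    domain_reference: Dict[str, float],
    ratio: float = 2.0,
) -> Tuple[List[str], List[str]]:
    """Return (over_emphasized, missing) aspects relative to the reference.

    `ratio` is a hypothetical threshold, not taken from the paper: an aspect
    is flagged as over-emphasized when its observed share is at least `ratio`
    times the expected share, and as missing (or under-discussed) when its
    observed share is at most the expected share divided by `ratio`.
    """
    observed = aspect_shares(entity_mentions)
    over, missing = [], []
    for aspect, expected in domain_reference.items():
        share = observed.get(aspect, 0.0)
        if share >= ratio * expected:
            over.append(aspect)
        elif share <= expected / ratio:
            missing.append(aspect)
    return sorted(over), sorted(missing)


# Toy ski-accommodation example; every number here is invented.
reference = {
    "lift access": 0.30,
    "ski storage": 0.20,
    "breakfast": 0.25,
    "wifi": 0.25,
}
mentions = ["breakfast"] * 6 + ["wifi"] * 3 + ["lift access"] * 1

over, missing = contrast_with_domain(mentions, reference)
print("over-emphasized:", over)
print("missing or under-discussed:", missing)
```

In this toy case the reviews dwell on breakfast far more than the reference distribution predicts and say little or nothing about lift access and ski storage, which is the kind of absence an expectation-based summary would surface. A real system would also need a way to extract aspects from review text and to fold the flagged aspects into the generated summary, neither of which is shown here.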