🤖 AI Summary
This work addresses the challenge of generating lay-friendly summaries of specialized academic literature (e.g., biomedical and NLP papers) in zero-shot settings. We propose a two-stage prompting framework grounded in authentic scientific writing practice: (1) key information extraction, followed by (2) reformulation into accessible language for non-expert readers. Our method requires no domain-specific fine-tuning or annotated data, relying solely on large language models (LLMs) and structured zero-shot prompts. Contributions are threefold: (1) a demonstration of cross-domain zero-shot transfer (biomedical → NLP) for lay summarization; (2) empirical validation that LLMs serve as reliable automatic evaluators, correlating strongly with human preferences; and (3) evidence that summaries generated by the proposed approach are increasingly preferred by human judges as model size grows. Experiments support the framework's generalizability, reliability, and practical utility across domains.
📝 Abstract
In this work, we explore the application of Large Language Models to zero-shot Lay Summarisation. We propose a novel two-stage framework for Lay Summarisation based on real-life processes, and find that summaries generated with this method are increasingly preferred by human judges as model size grows. To help establish best practices for employing LLMs in zero-shot settings, we also assess the ability of LLMs to act as judges, finding that they are able to replicate the preferences of human judges. Finally, we take initial steps towards Lay Summarisation for Natural Language Processing (NLP) articles, finding that LLMs are able to generalise to this new domain, and further highlighting the greater utility of summaries generated by our proposed approach via an in-depth human evaluation.
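The two-stage framework described above (extract key information, then reformulate it for non-experts) can be sketched as a simple zero-shot prompt chain. This is a minimal illustration, not the authors' exact prompts or pipeline: the `complete` function is a hypothetical stand-in for any LLM completion API, and the prompt wording is illustrative.

```python
# Sketch of a two-stage zero-shot lay-summarisation prompt chain.
# `complete` is a hypothetical placeholder for a real LLM API call;
# the prompts below are illustrative, not taken from the paper.

def complete(prompt: str) -> str:
    """Placeholder LLM call; swap in a real API client here."""
    return f"[model output for: {prompt[:40]}...]"


def lay_summarise(article: str) -> str:
    # Stage 1: extract the key information a lay reader would need.
    extraction_prompt = (
        "Extract the key findings, methods, and significance of the "
        "following article as short bullet points:\n\n" + article
    )
    key_points = complete(extraction_prompt)

    # Stage 2: reformulate the extracted points into plain language.
    rewrite_prompt = (
        "Rewrite the following points as a short summary that a "
        "non-expert can understand, avoiding jargon:\n\n" + key_points
    )
    return complete(rewrite_prompt)


print(lay_summarise("Example biomedical article text..."))
```

Because the two stages are just sequential prompts, the same chain can be pointed at a new domain (e.g., NLP papers instead of biomedical ones) without any fine-tuning, which is the zero-shot transfer the paper investigates.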