🤖 AI Summary
Existing automated survey generation methods treat outline construction as a templated subtask, leading to superficial topic understanding and coarse stylistic expression. To address this, we propose a metadata-driven, end-to-end framework for academic survey outline generation that abandons conventional pipeline architectures. Our approach employs hierarchical structural modeling and fine-grained stylistic control to achieve deep semantic comprehension of research topics and coherent, consistent expression. We adopt a two-stage training strategy—supervised fine-tuning followed by reinforcement learning—and validate the framework on a high-quality curated dataset of surveys from arXiv, bioRxiv, and medRxiv, together with systematic evaluation metrics. Experiments demonstrate that our 8B-parameter model significantly outperforms baselines in structural fidelity and stylistic coherence, markedly improving the organization, faithfulness, and practical utility of automatically generated outlines.
📝 Abstract
As the number of academic publications grows exponentially, automatically conducting in-depth surveys with LLMs has become an inevitable trend. Outline writing, which aims to systematically organize related works, is critical for automated survey generation. Yet existing automatic survey methods treat outline writing as a mere workflow step in the overall pipeline. Such template-based workflows produce outlines that lack an in-depth understanding of the survey topic and fine-grained stylistic control. To address these limitations, we propose Meow, the first metadata-driven outline writing framework that produces organized and faithful outlines efficiently. Specifically, we first formulate outline writing as an end-to-end task that generates hierarchical structured outlines from paper metadata. We then curate a high-quality dataset of surveys from arXiv, bioRxiv, and medRxiv, and establish systematic evaluation metrics for outline quality assessment. Finally, we employ a two-stage training approach combining supervised fine-tuning and reinforcement learning. Our 8B reasoning model demonstrates strong performance with high structural fidelity and stylistic coherence.
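The abstract frames outline writing as an end-to-end task mapping paper metadata to a hierarchical outline. A minimal sketch of what that input/output interface could look like is below; this is purely illustrative and not the paper's code — the prompt format, field names, and numbered-outline convention are all assumptions.

```python
# Illustrative sketch (hypothetical, not Meow's actual implementation):
# serialize paper metadata into a single prompt, and parse the model's
# numbered outline ("1.", "1.1.", ...) back into a hierarchy.

def build_prompt(topic, papers):
    """Serialize a survey topic and paper metadata (title + abstract) into one prompt."""
    lines = [f"Survey topic: {topic}", "Reference papers:"]
    for i, p in enumerate(papers, 1):
        lines.append(f"[{i}] {p['title']}: {p['abstract']}")
    lines.append("Write a hierarchical outline (1., 1.1., ...) for this survey.")
    return "\n".join(lines)

def parse_outline(text):
    """Parse numbered outline lines like '1.2. Methods' into (depth, title) pairs."""
    items = []
    for line in text.splitlines():
        line = line.strip()
        if not line or not line[0].isdigit():
            continue  # skip non-outline lines
        number, _, title = line.partition(" ")
        depth = number.rstrip(".").count(".") + 1  # "1." -> 1, "1.2." -> 2
        items.append((depth, title.strip()))
    return items
```

A structured parse like this is what the paper's "structural fidelity" metrics would presumably be computed over, e.g. by comparing the predicted hierarchy against a reference outline.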