PPTAgent: Generating and Evaluating Presentations Beyond Text-to-Slides

📅 2025-01-07
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing document-to-presentation generation methods largely neglect visual design and structural coherence, limiting their practical utility. To address this, we propose PPTAgent, a two-stage, edit-based generation framework: Stage I analyzes reference presentations to extract their structural patterns and content schemas; Stage II drafts an outline and generates slides through code-driven editing actions, keeping content and style consistent across slides. Furthermore, we introduce PPTEval, an evaluation framework covering three dimensions: content, visual design, and cross-slide coherence. Experiments show that our method significantly outperforms traditional automatic presentation generation methods across all three dimensions. The code and data are publicly available.

📝 Abstract
Automatically generating presentations from documents is a challenging task that requires balancing content quality, visual design, and structural coherence. Existing methods primarily focus on improving and evaluating the content quality in isolation, often overlooking visual design and structural coherence, which limits their practical applicability. To address these limitations, we propose PPTAgent, which comprehensively improves presentation generation through a two-stage, edit-based approach inspired by human workflows. PPTAgent first analyzes reference presentations to understand their structural patterns and content schemas, then drafts outlines and generates slides through code actions to ensure consistency and alignment. To comprehensively evaluate the quality of generated presentations, we further introduce PPTEval, an evaluation framework that assesses presentations across three dimensions: Content, Design, and Coherence. Experiments show that PPTAgent significantly outperforms traditional automatic presentation generation methods across all three dimensions. The code and data are available at https://github.com/icip-cas/PPTAgent.
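
The edit-based idea in the abstract (start from a reference slide, then apply code actions to it) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the `Slide` class, the `apply_edits` and `generate` functions, and the two action names `replace_text` and `delete` are hypothetical and are not taken from the PPTAgent codebase.

```python
from copy import deepcopy
from dataclasses import dataclass, field


@dataclass
class Slide:
    """A toy slide: a layout name plus text elements keyed by element id."""
    layout: str
    elements: dict[str, str] = field(default_factory=dict)


def apply_edits(reference: Slide, edits: list[tuple]) -> Slide:
    """Return an edited copy of a reference slide (the reference is untouched).

    Each edit is a hypothetical atomic action:
      ("replace_text", element_id, new_text)
      ("delete", element_id)
    """
    slide = deepcopy(reference)
    for action, element_id, *args in edits:
        if element_id not in slide.elements:
            raise KeyError(f"unknown element: {element_id}")
        if action == "replace_text":
            slide.elements[element_id] = args[0]
        elif action == "delete":
            del slide.elements[element_id]
        else:
            raise ValueError(f"unknown action: {action}")
    return slide


def generate(references: dict[str, Slide], outline: list[dict]) -> list[Slide]:
    """For each outline entry, pick a reference slide and edit it."""
    return [
        apply_edits(references[entry["reference"]], entry["edits"])
        for entry in outline
    ]


# Hypothetical reference slides and outline, for demonstration only.
references = {
    "title": Slide("title", {"heading": "Old title", "subtitle": "Old subtitle"}),
    "bullets": Slide("bullets", {"heading": "Old heading", "body": "Old body", "note": "Old note"}),
}
outline = [
    {"reference": "title", "edits": [("replace_text", "heading", "PPTAgent"), ("delete", "subtitle")]},
    {"reference": "bullets", "edits": [("replace_text", "body", "Content, Design, Coherence")]},
]
deck = generate(references, outline)
```

Because each generated slide is a copy of a reference slide with only the targeted elements changed, the layout and any untouched elements carry over unchanged, which is one plausible reading of how an edit-based approach helps consistency.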
Problem

Research questions and friction points this paper is trying to address.

Automatic Presentation Creation
Content Consistency
Visual Consistency
Innovation

Methods, ideas, or system contributions that make the work stand out.

PPTAgent
PPTEval
Quality Evaluation
Hao Zheng
Chinese Information Processing Laboratory, Institute of Software, Chinese Academy of Sciences; University of Chinese Academy of Sciences
Xinyan Guan
Institute of Software, Chinese Academy of Sciences
Hao Kong
Shanghai Jiexin Technology
Jia Zheng
Chinese Information Processing Laboratory, Institute of Software, Chinese Academy of Sciences
Hongyu Lin
Chinese Information Processing Laboratory, Institute of Software, Chinese Academy of Sciences
Yaojie Lu
Institute of Software, Chinese Academy of Sciences
Information Extraction, Large Language Models
Ben He
Professor, University of Chinese Academy of Sciences
Natural Language Processing, Information Retrieval
Xianpei Han
Chinese Information Processing Laboratory, Institute of Software, Chinese Academy of Sciences
Le Sun
Institute of Software, CAS
Information Retrieval, Natural Language Processing