Crafter: A Multi-Agent Harness for Editable Scientific Figure Generation from Diverse Inputs

📅 2026-05-28

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing automated figure generation systems each target a single figure type with text-only input and produce non-editable raster outputs, which falls short of what researchers need: support for diverse figure types and input conditions, and high-quality figures that can be revised locally. To address this, the work proposes Crafter, a multi-agent harness for figure generation that generalizes across figure types and input conditions without architectural changes. Complementing Crafter, CraftEditor applies the same pattern to faithfully convert raster outputs into editable SVGs, enabling local revision. On PaperBanana-Bench and the newly introduced CraftBench, Crafter substantially outperforms both standalone generators and the agentic baseline, with ablations confirming each component's independent contribution; the SVGs produced by CraftEditor surpass all baselines.

📝 Abstract

Scientific figures are among the most effective means of communicating complex research ideas, yet producing publication-quality illustrations remains one of the most labor-intensive parts of paper preparation. Existing automated systems each target a single figure type under text-only input, leaving the diversity of types and conditions researchers actually use unaddressed; their raster outputs further cannot be locally revised. Because scientific figures are structured compositions of discrete semantic components, the localized errors generators produce on such layouts demand not a stronger backbone but a harness. We instantiate this harness in two complementary systems: Crafter, a multi-agent harness for figure generation that generalizes across figure types and input conditions without architectural changes, and CraftEditor, which applies the same pattern to convert raster outputs into editable SVGs. Moreover, we introduce CraftBench, a benchmark spanning three figure types and four input conditions with human quality annotation. Experiments show that Crafter substantially outperforms both standalone generators and the agentic baseline on PaperBanana-Bench and CraftBench, with ablations confirming each component's independent contribution; CraftEditor faithfully converts outputs into editable SVGs that surpass all baselines. Our code and benchmark are available at https://github.com/HaozheZhao/Crafter.
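
To make concrete why an editable SVG permits the local revision that a raster image does not, the following is a minimal sketch using only the Python standard library. It is not Crafter or CraftEditor code: the three-component figure, the `id` values, and the `edit_label` helper are all hypothetical, and it assumes each semantic component is stored as its own `<g>` group.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
ET.register_namespace("", SVG_NS)  # serialize with a default namespace, no prefix

# Hypothetical figure: every semantic component lives in its own <g> with an id.
# The "decoder" label contains a deliberate typo that we want to fix locally.
FIGURE_SVG = f"""<svg xmlns="{SVG_NS}" width="340" height="120">
  <g id="encoder">
    <rect x="10" y="30" width="100" height="60" fill="#cfe2ff"/>
    <text x="60" y="65" text-anchor="middle">Encoder</text>
  </g>
  <g id="arrow">
    <line x1="110" y1="60" x2="220" y2="60" stroke="black"/>
  </g>
  <g id="decoder">
    <rect x="220" y="30" width="100" height="60" fill="#d1e7dd"/>
    <text x="270" y="65" text-anchor="middle">Decodr</text>
  </g>
</svg>"""


def edit_label(svg_text: str, component_id: str, new_label: str) -> str:
    """Replace the text label of one component, leaving all others untouched."""
    root = ET.fromstring(svg_text)
    for group in root.iter(f"{{{SVG_NS}}}g"):
        if group.get("id") == component_id:
            label = group.find(f"{{{SVG_NS}}}text")
            if label is None:
                raise ValueError(f"component {component_id!r} has no text label")
            label.text = new_label
            return ET.tostring(root, encoding="unicode")
    raise KeyError(component_id)


fixed_svg = edit_label(FIGURE_SVG, "decoder", "Decoder")
```

Because the label is an addressable element, only that element changes; the same fix on a raster image would mean regenerating or repainting pixels.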

Problem

Research questions and friction points this paper is trying to address.

scientific figure generation

multi-agent system

editable output

diverse inputs

raster-to-vector conversion

Innovation

Methods, ideas, or system contributions that make the work stand out.

multi-agent harness

editable scientific figure generation

SVG conversion