C2: Scalable Auto-Feedback for LLM-based Chart Generation

📅 2024-10-24
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
Large language models (LLMs) face scalability bottlenecks in chart generation due to scarce labeled data and the high cost of manually constructing <instruction, data, code> triplets. Method: We propose ChartAF, a reference-free automatic feedback framework, and ChartUIE-8K, a large-scale, diverse instruction dataset. ChartAF introduces the first LLM-based self-feedback distillation mechanism, integrated with instruction augmentation and multi-dimensional diversity construction—spanning queries, datasets, and chart types. Contribution/Results: ChartAF outperforms nine baselines significantly; ChartUIE-8K increases query coverage, dataset variety, and chart-type diversity by 59.8×, 19.4×, and 1.9×, respectively. User studies show 74% strong preference for feedback-optimized charts and 94% agreement on real-world applicability. This work establishes the first high-quality, closed-loop chart generation system without requiring gold-standard references, advancing LLM-based charting toward practical deployment.
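The closed-loop idea described above (generate chart code, critique it without a gold reference, revise) can be sketched as a minimal Python loop. This is a hypothetical illustration, not the paper's implementation: the function names, prompts, and the `rounds` parameter are all assumptions, and the LLM is stubbed out so the sketch runs standalone.

```python
# Hypothetical sketch of a reference-free feedback loop for chart code
# generation, in the spirit of ChartAF. Names, prompts, and structure
# are illustrative assumptions, not the paper's actual implementation.

def generate_chart_code(instruction, data, llm):
    """Ask the LLM for plotting code given an instruction and a dataset."""
    return llm(f"Write chart code for: {instruction}\nData: {data}")

def auto_feedback(chart_code, instruction, llm):
    """Reference-free critique: the feedback model evaluates the chart
    code against the instruction alone, with no gold-standard chart."""
    return llm(f"Critique this chart code against the request "
               f"'{instruction}':\n{chart_code}")

def refine(instruction, data, llm, rounds=2):
    """Closed loop: generate, critique, revise -- no human in the loop."""
    code = generate_chart_code(instruction, data, llm)
    for _ in range(rounds):
        feedback = auto_feedback(code, instruction, llm)
        code = llm(f"Revise the chart code using this feedback:\n"
                   f"{feedback}\nCode:\n{code}")
    return code

# Stub LLM so the sketch is self-contained; a real run would call a model.
mock_llm = lambda prompt: f"response({len(prompt)} chars)"
result = refine("bar chart of sales by region", [("N", 3), ("S", 5)], mock_llm)
print(result.startswith("response("))
```

The key property the sketch highlights is that `auto_feedback` never sees a reference chart, which is what removes the need for manually curated gold-standard outputs.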

📝 Abstract
Generating high-quality charts with Large Language Models (LLMs) presents significant challenges due to limited data and the high cost of scaling through human curation. ⟨instruction, data, code⟩ triplets are scarce and expensive to manually curate, as their creation demands technical expertise. To address this scalability challenge, we introduce a reference-free automatic feedback generator, which eliminates the need for costly human intervention. Our novel framework, C², consists of (1) an automatic feedback provider (ChartAF) and (2) a diverse, reference-free dataset (ChartUIE-8K). The results are compelling: in our first experiment, 74% of respondents strongly preferred, and 10% preferred, the results after feedback. The second post-feedback experiment demonstrates that ChartAF outperforms nine baselines. Moreover, ChartUIE-8K significantly improves data diversity by increasing queries, datasets, and chart types by 5982%, 1936%, and 91%, respectively, over benchmarks. Finally, a study of LLM users revealed that 94% of participants preferred ChartUIE-8K's queries, with 93% deeming them aligned with real-world use cases. Core contributions are available as open-source at chartsquared.github.io, with ample qualitative examples.
Problem

Research questions and friction points this paper is trying to address.

Scalable auto-feedback for LLM chart generation
Reducing human curation cost in chart creation
Enhancing data diversity with ChartUIE-8K
Innovation

Methods, ideas, or system contributions that make the work stand out.

Scalable auto-feedback for LLMs
Reference-free automatic feedback generator
Diverse, reference-free dataset ChartUIE-8K
Woosung Koh
Trillion Labs, KAIST AI
Foundation Models · Agents
Jang Han Yoon
Yonsei University
MinHyung Lee
Yonsei University
Youngjin Song
Yonsei University
Jaegwan Cho
Yonsei University
Jaehyun Kang
Yonsei University
Taehyeon Kim
KAIST AI
Se-young Yun
KAIST AI
Youngjae Yu
Yonsei University
Bongshin Lee
Yonsei University
Human-Data Interaction · Information Visualization · Human-Computer Interaction · Human-AI Interaction