DomainCQA: Crafting Expert-Level QA from Domain-Specific Charts

📅 2025-03-25
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing general-purpose Chart Question Answering (CQA) benchmarks inadequately evaluate the capacity of multimodal large language models (MLLMs) for deep reasoning that integrates visual information with domain-specific knowledge. To address this, we propose DomainCQA, a scalable, systematic methodology for constructing domain-specialized CQA benchmarks, and instantiate it in astronomy as AstroChart, which incorporates expert annotation, chart semantic parsing, and cross-modal alignment. Experiments identify the core bottlenecks of MLLMs as chart reasoning, joint analysis of multiple charts, and domain-knowledge-guided summarization, rather than recall of domain facts. AstroChart establishes the first rigorous, reproducible evaluation standard for domain-specialized MLLMs, advancing multimodal model assessment toward professional application scenarios.

📝 Abstract
Chart Question Answering (CQA) benchmarks are essential for evaluating the capability of Multimodal Large Language Models (MLLMs) to interpret visual data. However, current benchmarks focus primarily on the evaluation of general-purpose CQA but fail to adequately capture domain-specific challenges. We introduce DomainCQA, a systematic methodology for constructing domain-specific CQA benchmarks, and demonstrate its effectiveness by developing AstroChart, a CQA benchmark in the field of astronomy. Our evaluation shows that chart reasoning and combining chart information with domain knowledge for deeper analysis and summarization, rather than domain-specific knowledge, pose the primary challenge for existing MLLMs, highlighting a critical gap in current benchmarks. By providing a scalable and rigorous framework, DomainCQA enables more precise assessment and improvement of MLLMs for domain-specific applications.
Problem

Research questions and friction points this paper is trying to address.

Evaluating MLLMs' ability to interpret domain-specific charts
Addressing gaps in current general-purpose CQA benchmarks
Challenges in chart reasoning and domain knowledge integration
Innovation

Methods, ideas, or system contributions that make the work stand out.

Domain-specific CQA benchmark construction methodology
AstroChart for astronomy QA evaluation
Scalable framework for MLLM assessment
Authors
Ling Zhong
Zhejiang Lab
Yujing Lu
Zhejiang Lab
Jing Yang
Zhejiang Lab
Weiming Li
Principal Engineer, Samsung Electronics
Computer Vision, Augmented Reality, Computational Imaging and Display
Peng Wei
National Astronomical Observatories, Chinese Academy of Sciences
Yongheng Wang
Zhejiang Lab
Manni Duan
Zhejiang Lab
Qing Zhang
Zhejiang Lab