Toward Reliable Scientific Visualization Pipeline Construction with Structure-Aware Retrieval-Augmented LLMs

📅 2026-03-16
🤖 AI Summary
This work addresses the unreliability of natural language–to–executable scientific visualization pipelines in web environments, where failures often stem from missing stages, misused operators, or incorrect sequencing. To mitigate these issues, the authors propose a structure-aware retrieval-augmented generation (RAG) approach that retrieves vtk.js code examples aligned with the target pipeline’s structural schema and uses them as contextual guidance for large language models. This enables accurate module selection, parameter configuration, and execution ordering. The study introduces a novel metric—“correction cost”—to quantify the degree of human intervention required and implements an interactive analysis interface to facilitate human–AI collaborative evaluation. Experimental results demonstrate that the proposed method significantly enhances the executability and practical utility of generated pipelines while effectively reducing correction cost.
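The summary above describes retrieving examples whose structure matches the target pipeline. A minimal sketch of that idea follows, assuming each stored example is annotated with an ordered list of pipeline stages. The example store, the stage names, and the use of `difflib.SequenceMatcher` as the similarity measure are illustrative assumptions, not the authors' implementation.

```python
from difflib import SequenceMatcher

# Hypothetical example store: each entry pairs a snippet id with the ordered
# list of pipeline stages its code implements. Stage names are illustrative.
EXAMPLES = {
    "isosurface": ["reader", "contour", "mapper", "actor"],
    "volume": ["reader", "transfer_function", "volume_mapper", "volume"],
    "glyph": ["source", "mapper", "actor"],
}


def structural_similarity(a, b):
    """Order-sensitive overlap between two stage sequences, in [0, 1]."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def retrieve(target_stages, examples, k=2):
    """Return ids of the k examples whose stage order best matches the target."""
    ranked = sorted(
        examples,
        key=lambda name: structural_similarity(target_stages, examples[name]),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(task, target_stages, examples, k=2):
    """Assemble a prompt that lists the retrieved examples as context."""
    picked = retrieve(target_stages, examples, k)
    context = "\n".join(
        f"- {name}: {' -> '.join(examples[name])}" for name in picked
    )
    return f"Task: {task}\nStructurally similar examples:\n{context}"
```

In a fuller system the context would carry the example code itself rather than only the stage list. The point of the sketch is that ranking depends on stage order, not on surface text similarity.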

📝 Abstract
Scientific visualization pipelines encode domain-specific procedural knowledge with strict execution dependencies, making their construction sensitive to missing stages, incorrect operator usage, or improper ordering. Thus, generating executable scientific visualization pipelines from natural-language descriptions remains challenging for large language models, particularly in web-based environments where visualization authoring relies on explicit code-level pipeline assembly. In this work, we investigate the reliability of LLM-based scientific visualization pipeline generation, focusing on vtk.js as a representative web-based visualization library. We propose a structure-aware retrieval-augmented generation workflow that provides pipeline-aligned vtk.js code examples as contextual guidance, supporting correct module selection, parameter configuration, and execution order. We evaluate the proposed workflow across multiple multi-stage scientific visualization tasks and LLMs, measuring reliability in terms of pipeline executability and human correction effort. To this end, we introduce correction cost as a metric for the amount of manual intervention required to obtain a valid pipeline. Our results show that structured, domain-specific context substantially improves pipeline executability and reduces correction cost. We additionally provide an interactive analysis interface to support human-in-the-loop inspection and systematic evaluation of generated visualization pipelines.
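The abstract does not give a formula for correction cost. One simple way to approximate "amount of manual intervention" is to count the lines a person inserted, deleted, or rewrote to turn the generated code into a valid pipeline. The sketch below takes that line-level definition as an assumption; it is not the paper's metric, and the function name is hypothetical.

```python
from difflib import SequenceMatcher


def correction_cost(generated: str, corrected: str) -> int:
    """Count lines removed from or added to the generated code (assumed metric)."""
    # Compare non-empty lines with surrounding whitespace stripped, so that
    # pure indentation changes do not count as corrections.
    gen = [ln.strip() for ln in generated.splitlines() if ln.strip()]
    fix = [ln.strip() for ln in corrected.splitlines() if ln.strip()]
    matcher = SequenceMatcher(None, gen, fix, autojunk=False)
    cost = 0
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # Deleted lines plus inserted lines; a rewritten line counts twice.
            cost += (i2 - i1) + (j2 - j1)
    return cost


generated = """
const mapper = vtkMapper.newInstance();
const actor = vtkActor.newInstance();
renderer.addActor(actor);
"""

corrected = """
const mapper = vtkMapper.newInstance();
mapper.setInputConnection(reader.getOutputPort());
const actor = vtkActor.newInstance();
actor.setMapper(mapper);
renderer.addActor(actor);
"""

print(correction_cost(generated, corrected))
```

Here the generated pipeline omits the two wiring calls between reader, mapper, and actor, so two inserted lines are counted. A token-level or edit-operation-level count would be an equally plausible reading of the metric.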
Problem

Research questions and friction points this paper is trying to address.

scientific visualization
pipeline generation
large language models
vtk.js
reliability
Innovation

Methods, ideas, or system contributions that make the work stand out.

structure-aware retrieval
retrieval-augmented generation
scientific visualization pipeline
vtk.js
correction cost
Guanghui Zhao
Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences; University of Chinese Academy of Sciences
Zhe Wang
Computer Network Information Center (CNIC), CAS
Scientific Visualization, In Situ Workflow, High Performance Computing
Yu Dong
Computer Network Information Center, Chinese Academy of Sciences
Visual Analytics, Human-Computer Interaction
Guan Li
University of Chinese Academy of Sciences; Computer Network Information Center, Chinese Academy of Sciences
GuiHua Shan
Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences; University of Chinese Academy of Sciences; Computer Network Information Center, Chinese Academy of Sciences