🤖 AI Summary
Existing genomics visualization tools struggle to balance customizability with usability, and typically require programming or visualization expertise. This work proposes a large language model (LLM)-based multi-agent system that generates multi-view, interactive genomics visualizations, expressed in the Gosling grammar, through natural-language interaction. The study systematically characterizes where LLM generation succeeds and fails in this domain and identifies eight quality dimensions for assessment. Across 159 test cases, the agent-based iterative generation strategy substantially improves perceived quality over both direct generation and a fixed pipeline. More complex agent architectures confer no additional benefit, suggesting that structured yet streamlined agent strategies are sufficient for domain-specific visualization authoring in genomics.
📝 Abstract
Diverse genomics data, scientific questions, and analysis tasks typically demand highly specialized visualizations. Therefore, users must often customize existing visualizations or author new ones tailored to their data. Existing tools are usually either limited in customization or require substantial learning or programming, and even expressive tools assume visualization expertise that many users lack. Agentic and large language model (LLM) approaches are increasingly applied to complex scientific tasks, including visualization. Natural-language conversational interfaces offer a promising path to democratizing the authoring of complex visualizations. In the context of genomics, these approaches face additional challenges: genomics visualizations typically integrate heterogeneous data types and are composed of multiple linked interactive views. These challenges motivate more structured LLM-based schemes. We first characterize where vanilla LLM generation succeeds and fails for genomics visualization, identifying eight quality dimensions. We then compare six schemes (direct generation, a fixed pipeline, and four agentic configurations varying in the number of specialist agents and the presence of a reviewer) across 159 cases spanning three levels of query ambiguity and specification complexity. All schemes use the Gosling visualization grammar as structured output. Agentic iteration substantially improves perceived quality over both baselines, while more complex agent architectures yield no additional benefit. We discuss implications for designing agentic systems for domain-specific visualization authoring. All supplemental materials are available at https://osf.io/uqe83.