Agentic Mixture-of-Workflows for Multi-Modal Chemical Search

📅 2025-02-26

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Problem: Applying large language models (LLMs) to materials science is hindered by the absence of standardized benchmarks and of practical, modular frameworks for chemical reasoning. Method: This paper introduces CRAG-MoW (Mixture-of-Workflows for Self-Corrective Retrieval-Augmented Generation), a multi-agent framework in which several agentic workflows, each applying a distinct CRAG strategy with an open-source LLM, address the same query, and an orchestration agent synthesizes their outputs. The framework covers retrieval over small molecules, polymers, and chemical reactions, as well as multimodal NMR spectral retrieval. Contribution/Results: Because every workflow addresses the same problem, CRAG-MoW allows direct comparison of multiple LLMs across tasks. Experiments show performance comparable to GPT-4o across the benchmarked chemical search tasks, with CRAG-MoW outputs preferred more often in comparative evaluations. The results also characterize how performance varies across data modalities (e.g., SMILES, IUPAC names, spectra), indicating that the effectiveness of a given architecture depends on the type of data it is searching.

📝 Abstract

The vast and complex materials design space demands innovative strategies to integrate multidisciplinary scientific knowledge and optimize materials discovery. While large language models (LLMs) have demonstrated promising reasoning and automation capabilities across various domains, their application in materials science remains limited due to a lack of benchmarking standards and practical implementation frameworks. To address these challenges, we introduce Mixture-of-Workflows for Self-Corrective Retrieval-Augmented Generation (CRAG-MoW), a novel paradigm that orchestrates multiple agentic workflows employing distinct CRAG strategies using open-source LLMs. Unlike prior approaches, CRAG-MoW synthesizes diverse outputs through an orchestration agent, enabling direct evaluation of multiple LLMs across the same problem domain. We benchmark CRAG-MoWs across small molecules, polymers, and chemical reactions, as well as multi-modal nuclear magnetic resonance (NMR) spectral retrieval. Our results demonstrate that CRAG-MoWs achieve performance comparable to GPT-4o while being preferred more frequently in comparative evaluations, highlighting the advantage of structured retrieval and multi-agent synthesis. By revealing performance variations across data types, CRAG-MoW provides a scalable, interpretable, and benchmark-driven approach to optimizing AI architectures for materials discovery. These insights are pivotal in addressing fundamental gaps in benchmarking LLMs and autonomous AI agents for scientific applications.

Problem

Research questions and friction points this paper is trying to address.

Optimize materials discovery workflows

Benchmark large language models

Enhance multi-modal chemical search

Innovation

Methods, ideas, or system contributions that make the work stand out.

Mixture-of-Workflows for CRAG

Multi-agent synthesis approach

Benchmark-driven AI optimization
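
The Mixture-of-Workflows idea listed above can be illustrated with a minimal Python sketch. It is not the paper's implementation. Everything in it is an assumption made for illustration: the toy `CORPUS`, the token-overlap `retrieve` and `grade` functions, the punctuation-stripping query rewrite used as the corrective step, and the plain functions standing in for LLM calls. A real system would replace these with a chemical database, an LLM-based grader, and an LLM orchestration agent.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, List

# Toy corpus standing in for a chemical database (illustrative entries only).
CORPUS = [
    "ethanol SMILES CCO small molecule",
    "benzene SMILES c1ccccc1 small molecule",
    "polystyrene polymer repeat unit styrene",
]


def tokens(text: str) -> set:
    return set(text.lower().split())


def retrieve(query: str, k: int = 2) -> List[str]:
    """Rank corpus entries by token overlap with the query (hypothetical retriever)."""
    q = tokens(query)
    return sorted(CORPUS, key=lambda d: len(q & tokens(d)), reverse=True)[:k]


def grade(query: str, doc: str) -> bool:
    """Stand-in relevance grader: a document passes if it shares a token with the query."""
    return bool(tokens(query) & tokens(doc))


@dataclass
class Workflow:
    name: str
    generate: Callable[[str, List[str]], str]  # stand-in for an LLM call

    def run(self, query: str) -> str:
        docs = [d for d in retrieve(query) if grade(query, d)]
        if not docs:
            # Corrective step (simplified): rewrite the query and retrieve again.
            rewritten = re.sub(r"[^\w\s]", "", query)
            docs = [d for d in retrieve(rewritten) if grade(rewritten, d)]
        return self.generate(query, docs)


def orchestrate(query: str, workflows: List[Workflow]) -> Dict[str, str]:
    """Run every workflow on the same query and collect the answers side by side.

    A real orchestration agent would synthesize these with an LLM; here they are
    only joined into one string so the per-workflow outputs stay comparable.
    """
    answers = {w.name: w.run(query) for w in workflows}
    answers["synthesis"] = " | ".join(f"{n}: {a}" for n, a in answers.items())
    return answers


# Two hypothetical workflows that differ only in their generation strategy.
WORKFLOWS = [
    Workflow("first-hit", lambda q, docs: docs[0] if docs else "no evidence"),
    Workflow("count", lambda q, docs: f"{len(docs)} supporting record(s)"),
]
```

Calling `orchestrate("polystyrene", WORKFLOWS)` returns one answer per workflow plus a combined `synthesis` entry. Keeping the per-workflow answers alongside the synthesis is what makes a side-by-side comparison of the underlying models possible.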

🔎 Similar Papers

An Autonomous Large Language Model Agent for Chemical Literature Data Mining

2024-02-20, arXiv.org, Citations: 5

Chemist-X: Large Language Model-empowered Agent for Reaction Condition Recommendation in Chemical Synthesis

2023-11-16, Citations: 8