🤖 AI Summary
Problem: Conventional template-based methods struggle to discover novel retrosynthetic pathways for complex molecules. Method: This paper proposes an iterative human–AI collaborative framework that integrates large language model (LLM) reasoning with chemical prior knowledge. It combines a reaction template engine, Monte Carlo tree search, and LLM-based pathway generation, augmented with recursive feedback, hallucination detection, and stability validation, and it exposes a graphical user interface for closed-loop interactive reasoning. Contribution/Results: To our knowledge, this is the first approach to deeply couple the multi-step logical reasoning of LLMs with chemical validity constraints, enabling autonomous pathway exploration and dynamic refinement. Evaluated on multiple natural product benchmarks, it generates high-feasibility, structurally novel multi-step retrosynthetic routes and significantly outperforms both purely template-based and purely generative baselines.
📝 Abstract
Retrosynthesis, the identification of precursor molecules for a target compound, is pivotal for synthesizing complex molecules, but it faces challenges in discovering novel pathways beyond predefined templates. Recent large language model (LLM) approaches to retrosynthesis have shown promise, but how to harness LLM reasoning capabilities for effective multi-step planning remains an open question. To address this challenge, we introduce DeepRetro, an open-source, iterative, hybrid LLM-based retrosynthetic framework. Our approach integrates the strengths of conventional template-based/Monte Carlo tree search tools with the generative power of LLMs in a step-wise, feedback-driven loop. Initially, synthesis planning is attempted with a template-based engine. If this fails, the LLM proposes single-step retrosynthetic disconnections. Crucially, these suggestions undergo rigorous validity, stability, and hallucination checks before the resulting precursors are recursively fed back into the pipeline for further evaluation. This iterative refinement allows for dynamic pathway exploration and correction. We demonstrate the potential of this pipeline through benchmark evaluations and case studies, showcasing its ability to identify viable and potentially novel retrosynthetic routes. In particular, we develop an interactive graphical user interface that allows expert chemists to provide human-in-the-loop feedback to the reasoning algorithm. This approach successfully generates novel pathways for complex natural product compounds, demonstrating the potential for iterative LLM reasoning to advance the state of the art in complex chemical synthesis.
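The control flow described in the abstract (template engine first, LLM single-step fallback, checks, then recursion on the precursors) can be illustrated with a minimal sketch. This is not DeepRetro's actual code or API: the function names, the `Route` dictionary layout, the `max_depth` safeguard, and the toy one-letter "molecules" are all assumptions made for illustration, and the template engine, LLM, and checks are replaced by simple stand-in functions.

```python
from typing import Any, Callable, Dict, List, Optional

# Hypothetical data layout: {"target": str, "precursors": [Route, ...]}
Route = Dict[str, Any]
Check = Callable[[str, List[str]], bool]


def plan(
    target: str,
    template_search: Callable[[str], Optional[Route]],
    llm_propose: Callable[[str], List[List[str]]],
    checks: List[Check],
    max_depth: int = 5,
) -> Optional[Route]:
    """Hybrid planning loop; every name here is illustrative, not the paper's API."""
    # 1. Try the template-based engine first.
    route = template_search(target)
    if route is not None:
        return route
    # Assumed safeguard (not stated in the abstract): bound the recursion depth.
    if max_depth == 0:
        return None
    # 2. On failure, ask the LLM for single-step disconnections.
    for precursors in llm_propose(target):
        # 3. Discard proposals that fail any validity/stability/hallucination check.
        if not all(check(target, precursors) for check in checks):
            continue
        # 4. Feed each surviving precursor back into the same pipeline.
        subroutes = [
            plan(p, template_search, llm_propose, checks, max_depth - 1)
            for p in precursors
        ]
        if all(s is not None for s in subroutes):
            return {"target": target, "precursors": subroutes}
    return None


# --- Toy stand-ins so the sketch runs without any chemistry toolkit ---

def toy_template_search(mol: str) -> Optional[Route]:
    # Pretend the template engine can solve "B" and that "C" is in stock.
    known = {"B": ["B1", "B2"], "C": []}
    if mol not in known:
        return None
    leaves = [{"target": p, "precursors": []} for p in known[mol]]
    return {"target": mol, "precursors": leaves}


def toy_llm_propose(mol: str) -> List[List[str]]:
    # Pretend the LLM returns one bad and one good disconnection for "A".
    return {"A": [["X", "X"], ["B", "C"]]}.get(mol, [])


def no_placeholder_atoms(target: str, precursors: List[str]) -> bool:
    # Stand-in for a hallucination check: reject the made-up fragment "X".
    return "X" not in precursors


route = plan("A", toy_template_search, toy_llm_propose, [no_placeholder_atoms])
print(route)
```

In this toy run the template engine fails on "A", the first LLM proposal is rejected by the check, and the second is accepted because both of its precursors resolve when fed back into the pipeline. A real implementation would replace the stand-ins with an actual template/MCTS planner, an LLM call, and chemistry-aware checks, and the human-in-the-loop feedback described in the abstract would sit on top of this loop.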