DeepRetro: Retrosynthetic Pathway Discovery using Iterative LLM Reasoning

📅 2025-07-07
🤖 AI Summary
Problem: Conventional template-based methods struggle to discover novel retrosynthetic pathways for complex molecules. Method: This paper proposes an iterative human-AI collaborative framework that integrates large language model (LLM) reasoning with chemical prior knowledge. It combines a reaction template engine, Monte Carlo tree search, and LLM-based pathway generation, augmented with recursive feedback, hallucination detection, and stability validation, and exposes the loop through a graphical user interface for closed-loop interactive reasoning. Contribution/Results: To our knowledge, this is the first approach to deeply couple the multi-step logical reasoning of LLMs with chemical validity constraints, enabling autonomous pathway exploration and dynamic refinement. Evaluated on multiple natural product benchmarks, it generates high-feasibility, structurally novel multi-step retrosynthetic routes, significantly outperforming both purely template-based and purely generative baselines.

📝 Abstract
Retrosynthesis, the identification of precursor molecules for a target compound, is pivotal for synthesizing complex molecules, but it faces challenges in discovering novel pathways beyond predefined templates. Recent large language model (LLM) approaches to retrosynthesis have shown promise, but how to harness LLM reasoning capabilities for effective multi-step planning remains an open question. To address this challenge, we introduce DeepRetro, an open-source, iterative, hybrid LLM-based retrosynthetic framework. Our approach integrates the strengths of conventional template-based/Monte Carlo tree search tools with the generative power of LLMs in a step-wise, feedback-driven loop. Initially, synthesis planning is attempted with a template-based engine. If this fails, the LLM proposes single-step retrosynthetic disconnections. Crucially, these suggestions undergo rigorous validity, stability, and hallucination checks before the resulting precursors are recursively fed back into the pipeline for further evaluation. This iterative refinement allows for dynamic pathway exploration and correction. We demonstrate the potential of this pipeline through benchmark evaluations and case studies, showcasing its ability to identify viable and potentially novel retrosynthetic routes. In particular, we develop an interactive graphical user interface that allows expert chemists to provide human-in-the-loop feedback to the reasoning algorithm. This approach successfully generates novel pathways for complex natural product compounds, demonstrating the potential of iterative LLM reasoning to advance the state of the art in complex chemical syntheses.
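
The control flow described in the abstract (template engine first, LLM fallback, checks, recursion on the precursors) can be illustrated with a minimal Python sketch. This is not DeepRetro's actual code: `plan_route`, its callback parameters, the `max_depth` cutoff, and the toy molecule names are all hypothetical, and the three checks are collapsed into a single `passes_checks` callback as a simplifying assumption.

```python
from typing import Callable, Dict, List, Optional

# A route maps each molecule that needs a reaction step to its chosen precursors.
# An empty route means the molecule needs no further steps (e.g. it is purchasable).
Route = Dict[str, List[str]]


def plan_route(
    target: str,
    template_plan: Callable[[str], Optional[Route]],
    llm_propose: Callable[[str], List[List[str]]],
    passes_checks: Callable[[str, List[str]], bool],
    max_depth: int = 3,  # hypothetical cutoff so the recursion always terminates
) -> Optional[Route]:
    """Return a route for `target`, or None if no route is found."""
    # Step 1: attempt planning with the template-based engine.
    route = template_plan(target)
    if route is not None:
        return route
    if max_depth == 0:
        return None
    # Step 2: fall back to single-step disconnections proposed by the LLM.
    for precursors in llm_propose(target):
        # Step 3: discard proposals that fail validity/stability/hallucination checks.
        if not passes_checks(target, precursors):
            continue
        route = {target: precursors}
        # Step 4: feed each precursor back into the same pipeline.
        for mol in precursors:
            sub = plan_route(mol, template_plan, llm_propose, passes_checks, max_depth - 1)
            if sub is None:
                break  # this disconnection is a dead end; try the next proposal
            route.update(sub)
        else:
            return route  # every precursor was solved
    return None


# Toy stand-ins (assumptions for illustration only, not real chemistry).
PURCHASABLE = {"C", "b1", "b2"}
TEMPLATE_ROUTES = {"B": {"B": ["b1", "b2"]}}
KNOWN = PURCHASABLE | {"A", "B"}


def toy_template_plan(mol: str) -> Optional[Route]:
    if mol in PURCHASABLE:
        return {}
    return TEMPLATE_ROUTES.get(mol)


def toy_llm_propose(mol: str) -> List[List[str]]:
    # The first proposal mentions "X", which the checks treat as hallucinated.
    return {"A": [["X"], ["B", "C"]]}.get(mol, [])


def toy_passes_checks(target: str, precursors: List[str]) -> bool:
    return all(p in KNOWN for p in precursors)


route = plan_route("A", toy_template_plan, toy_llm_propose, toy_passes_checks)
print(route)  # {'A': ['B', 'C'], 'B': ['b1', 'b2']}
```

In this toy run the template engine cannot plan "A", the first LLM proposal is rejected by the checks, and the second is accepted because the template engine can solve "B" and "C" is treated as purchasable. A real system would replace the callbacks with a template/MCTS planner, an LLM call, and chemistry-aware validators.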

Problem

Research questions and friction points this paper is trying to address.

Discover novel retrosynthetic pathways beyond predefined templates
Harness LLM reasoning for effective multi-step synthesis planning
Ensure validity and stability of proposed retrosynthetic disconnections

Innovation

Methods, ideas, or system contributions that make the work stand out.

Hybrid LLM-based iterative retrosynthetic framework
Combines template-based and LLM generative methods
Interactive GUI for human-in-the-loop feedback

Authors

Shreyas Vinaya Sathyanarayana
Deep Forest Sciences, California, USA; Department of Chemistry, Birla Institute of Technology & Science, Pilani, Goa, India; Department of Computer Science and Information Systems, Birla Institute of Technology & Science, Pilani, Goa, India.
Rahil Shah
Deep Forest Sciences, California, USA; Department of Chemistry, Birla Institute of Technology & Science, Pilani, Goa, India; Department of Computer Science and Information Systems, Birla Institute of Technology & Science, Pilani, Goa, India.
Sharanabasava D. Hiremath
Deep Forest Sciences, California, USA; Department of Chemical Sciences, IISER Kolkata, West Bengal, India.
Rishikesh Panda
Deep Forest Sciences, California, USA; Department of Biology, Birla Institute of Technology & Science, Pilani, Goa, India; Department of Electrical & Electronics Engineering, Birla Institute of Technology & Science, Pilani, Goa, India.
Rahul Jana
Deep Forest Sciences, California, USA
Riya Singh
Graduate student, Purdue University
Cosmology, Dark Matter, Simulation Frameworks
Rida Irfan
Deep Forest Sciences, California, USA
Ashwin Murali
Deep Forest Sciences, California, USA
Bharath Ramsundar
Deep Forest Sciences, DeepChem, previously Computable, previously Stanford
Differentiable Physics, Drug Discovery, Machine Learning, Cryptography, Cryptoeconomics