🤖 AI Summary
Large language models (LLMs) exhibit limited reliability, interpretability, and theoretical consistency when applied to physics research. Method: This paper proposes a three-module collaborative framework: (1) a reasoning module integrating physics-informed prompt engineering and symbolic reasoning; (2) an interpretation module introducing a novel multi-agent, structured explanation mechanism tailored to the physical sciences, encompassing summarization, model building, user-interface generation, and test specification; and (3) an interaction module enabling traceable human–AI co-verification. Contribution/Results: The framework achieves the first end-to-end transformation of LLM outputs into verifiable scientific models, tightly coupling eXplainable AI (XAI) with domain-specific scientific workflows. Empirical evaluation demonstrates substantial improvements in output transparency, theoretical consistency with physical principles, and empirical falsifiability. This work establishes a new paradigm for trustworthy, AI-augmented scientific discovery.
📝 Abstract
Large Language Models (LLMs) are playing an expanding role in physics research by enhancing reasoning, symbolic manipulation, and numerical computation. However, ensuring the reliability and interpretability of their outputs remains a significant challenge. In our framework, we conceptualize the collaboration between AI and human scientists as a dynamic interplay among three modules: the reasoning module, the interpretation module, and the AI-scientist interaction module. Recognizing that effective physics reasoning demands rigorous logical consistency, quantitative precision, and deep integration with established theoretical models, we introduce the interpretation module to improve the understanding of AI-generated outputs, an aspect not previously explored in the literature. This module comprises multiple specialized agents, including summarizers, model builders, UI builders, and testers, which collaboratively structure LLM outputs within a physically grounded framework, constructing a more interpretable scientific model. A case study demonstrates that our approach enhances transparency, facilitates validation, and strengthens AI-augmented reasoning in scientific discovery.