AI Summary
This work addresses systematic challenges in applying large language models (LLMs) to causal inference, a task critical in domains such as medicine and economics. Methodologically, it integrates prompt engineering, chain-of-thought reasoning, retrieval-augmented generation (RAG), and structured causal modeling, while incorporating domain-specific knowledge injection and causal validation mechanisms. The paper proposes the first unified LLM-powered causal inference framework, explicitly delineating task hierarchies (association identification → causal discovery → effect estimation → counterfactual reasoning), evaluation paradigms, and knowledge integration pathways. Based on a comprehensive review and empirical analysis of over 50 state-of-the-art studies, it establishes principled guidelines for scalable benchmark design. Results reveal fundamental capability boundaries of LLMs in causal discovery, effect estimation, and counterfactual generation, identifying three core bottlenecks: limited generalizability across causal structures, insufficient interpretability of inferred mechanisms, and poor robustness to interventions.
Abstract
Causal inference has long been a pivotal challenge across diverse domains such as medicine and economics, demanding a complex integration of human knowledge, mathematical reasoning, and data mining capabilities. Recent advancements in natural language processing (NLP), particularly the advent of large language models (LLMs), have introduced promising opportunities for traditional causal inference tasks. This paper reviews recent progress in applying LLMs to causal inference, encompassing tasks that span different levels of causation. We summarize the main causal problems and approaches, and compare their evaluation results across different causal scenarios. Furthermore, we discuss key findings and outline directions for future research, underscoring the potential of integrating LLMs to advance causal inference methodologies.