🤖 AI Summary
Large language models can generate fluent explanations, but they struggle to reliably uncover true causal mechanisms: predictive success does not guarantee mechanistic correctness, a problem of mechanistic unidentifiability. This work identifies that risk and proposes a paradigm termed “mechanistic machine learning,” which prioritizes identifying verifiable structural relationships over relying on black-box models for scientific discovery. Through theoretical analysis and philosophical argument, together with an examination of mechanism equivalence classes under high-dimensional observations, the paper sets out methodological principles for mechanistic identifiability. These principles are intended to guide the development of large language model workflows that genuinely support scientific inquiry.
📝 Abstract
Modern machine learning (ML) and artificial intelligence (AI) models, especially large language models (LLMs), are increasingly used to generate scientific hypotheses and mechanistic explanations from observational data. This position paper argues that in the high-dimensional proxy regimes where modern ML excels, mechanistic learning is generically underdetermined: many incompatible mechanisms induce essentially the same observational relationships on the support of the data, so predictive success and coherent explanations are insufficient evidence of mechanism discovery. This underdetermination becomes uniquely hazardous with LLMs, which tend to collapse large equivalence classes of explanations into a single fluent narrative. The paper proposes concrete standards for “mechanistic ML” and argues that these norms are necessary if LLM-centered workflows are to support science rather than merely simulate it.
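The underdetermination claim can be illustrated with a textbook two-variable case. The sketch below is not taken from the paper: the linear-Gaussian setup, the coefficient values, and all function names are assumptions chosen for illustration. It compares two incompatible mechanisms, "X causes Y" and "Y causes X", and checks that they imply the same observational moments while disagreeing about the effect of intervening on X.

```python
import math

def implied_moments_x_causes_y(b, noise_var):
    """Mechanism A (hypothetical): X ~ N(0, 1), then Y = b*X + e, Var(e) = noise_var.
    Returns (Var(X), Cov(X, Y), Var(Y))."""
    var_x = 1.0
    return (var_x, b * var_x, b**2 * var_x + noise_var)

def implied_moments_y_causes_x(var_y, c, noise_var):
    """Mechanism B (hypothetical): Y ~ N(0, var_y), then X = c*Y + u, Var(u) = noise_var.
    Returns (Var(X), Cov(X, Y), Var(Y))."""
    return (c**2 * var_y + noise_var, c * var_y, var_y)

# Illustrative parameters for mechanism A (assumed values).
b, e_var = 0.8, 0.36
moments_a = implied_moments_x_causes_y(b, e_var)

# Choose mechanism B's parameters so that it matches A observationally.
var_y = b**2 + e_var
c = b / var_y
u_var = 1.0 - c**2 * var_y
moments_b = implied_moments_y_causes_x(var_y, c, u_var)

# Zero-mean Gaussians are fully described by these three moments,
# so matching moments means identical observational distributions.
same_observations = all(math.isclose(p, q) for p, q in zip(moments_a, moments_b))

# The mechanisms still disagree about interventions on X.
x_set = 1.0
effect_a = b * x_set  # E[Y | do(X = x_set)] when X causes Y
effect_b = 0.0        # Y is upstream of X in B, so setting X leaves E[Y] at 0

print("same observational moments:", same_observations)
print("E[Y | do(X=1)] under A:", effect_a, "| under B:", effect_b)
```

In this toy case no amount of observational data distinguishes A from B, yet a single fluent explanation would have to commit to one of them. The abstract's argument concerns the much larger equivalence classes that arise in high-dimensional proxy regimes.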