🤖 AI Summary
This work addresses the challenge of distinguishing genuine causal relationships from correlations or recurring temporal patterns in manufacturing system alarm logs. It proposes LMT, described as the first framework to integrate semantic causal priors extracted by large language models with a timestamp-based Poisson process likelihood for Bayesian causal discovery. By jointly modeling textual semantics and temporal statistical evidence, the method infers causal graphs that are both interpretable and supported by data. Simulation experiments across diverse scenarios show that it outperforms text-only and time-series-only baselines, particularly in low-data regimes with sparse alarm events.
📝 Abstract
Textual event records, such as alarm logs, have become an increasingly common data source in engineering and manufacturing systems. Beyond identifying correlations or recurring patterns, engineers are often interested in understanding which types of events causally trigger or influence other events during system operation. Textual event descriptions may contain semantic clues about such causal relationships, and recent large language models (LLMs) provide a promising tool for extracting these signals. However, relying solely on LLM-encoded textual information is insufficient for accurate causal discovery, since semantic patterns do not directly reveal causal mechanisms and may confuse causation with correlation or frequent sequential patterns. To address these challenges, we propose LMT, a Bayesian causal discovery framework for engineering event data that jointly leverages textual descriptions and timestamps. Specifically, LMT first uses LLMs to extract semantic causal signals from event descriptions and constructs a prior distribution over causal graphs among event types or event clusters. It then incorporates temporal evidence through a Poisson-process-based likelihood, allowing the LLM-informed prior to be refined by timestamp-based statistical evidence. By integrating the textual and temporal information, LMT produces a causal graph that is both interpretable and data-supported. Simulation studies show that the proposed framework is effective across different settings and is especially advantageous in small-sample alarm-event scenarios.
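To make the prior-times-likelihood idea concrete, the following is a minimal sketch for a single candidate edge, not the paper's actual model. The abstract does not specify the form of the likelihood, so everything here is an assumption: the fixed influence `window`, the piecewise-constant Poisson rates, the Gamma(1, 1) rate priors, and the function names `log_marginal`, `window_exposure`, `count_in_windows`, and `edge_posterior`. The LLM-derived prior is represented by a hard-coded probability rather than an actual model call.

```python
import bisect
import math


def log_marginal(n, exposure, a=1.0, b=1.0):
    """Log marginal likelihood of n Poisson-process events over `exposure`
    time units, with the unknown rate integrated out under Gamma(a, rate=b)."""
    return (a * math.log(b) - math.lgamma(a)
            + math.lgamma(a + n) - (a + n) * math.log(b + exposure))


def window_exposure(causes_sorted, window, horizon):
    """Length of the union of [t, t + window) over all cause times, clipped to [0, horizon]."""
    total, covered_until = 0.0, 0.0
    for t in causes_sorted:
        start = max(t, covered_until)
        end = min(t + window, horizon)
        if end > start:
            total += end - start
            covered_until = end
    return total


def count_in_windows(causes_sorted, effect_times, window):
    """Number of effect events that fall within `window` after the latest preceding cause."""
    n = 0
    for s in effect_times:
        i = bisect.bisect_right(causes_sorted, s) - 1
        if i >= 0 and s - causes_sorted[i] < window:
            n += 1
    return n


def edge_posterior(prior, cause_times, effect_times, window, horizon):
    """Posterior probability of the edge cause -> effect.

    `prior` must lie strictly between 0 and 1 (assumed to come from an LLM).
    H0: one effect rate everywhere. H1: separate rates inside and outside
    the windows that follow cause events (assumed model, for illustration).
    """
    causes = sorted(cause_times)
    t_in = window_exposure(causes, window, horizon)
    t_out = horizon - t_in
    n_in = count_in_windows(causes, effect_times, window)
    n_out = len(effect_times) - n_in
    log_h1 = log_marginal(n_in, t_in) + log_marginal(n_out, t_out)
    log_h0 = log_marginal(len(effect_times), horizon)
    log_odds = math.log(prior / (1.0 - prior)) + log_h1 - log_h0
    return 1.0 / (1.0 + math.exp(-log_odds))


# Hypothetical data: alarm A fires every 20 time units over a horizon of 100.
llm_prior = 0.6  # stand-in for an LLM-derived belief that A causes B
causes = [10, 30, 50, 70, 90]
triggered = [10.5, 30.5, 50.5, 70.5, 90.5]  # B shortly after each A
unrelated = [5, 25, 45, 65, 85]             # B never close after A

p_triggered = edge_posterior(llm_prior, causes, triggered, window=2.0, horizon=100.0)
p_unrelated = edge_posterior(llm_prior, causes, unrelated, window=2.0, horizon=100.0)
print(p_triggered, p_unrelated)
```

In this toy setup the timestamps raise the edge probability above the prior when B consistently follows A and lower it when B never does, which mirrors the refinement step described in the abstract. With few events the Bayes factor stays close to 1, so the prior dominates; this is one plausible reading of why an informative prior would help in small-sample settings.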