🤖 AI Summary
Existing STL specification inference methods lack statistical confidence guarantees, undermining the reliability of the interpretable rules they produce. This paper introduces conformal prediction to STL inference for the first time via an end-to-end differentiable framework: a robustness-based nonconformity score is designed, a conformal loss is integrated into training, and smoothed STL semantics with gradient-based optimization jointly optimize both formula structure and prediction-set quality. The method preserves statistical validity, achieving the guaranteed (1−α) coverage, while significantly reducing prediction-set size, improving coverage accuracy, and lowering misclassification rates. Experiments demonstrate consistent superiority over state-of-the-art approaches across multiple benchmark tasks. To the authors' knowledge, this is the first deep learning framework for interpretable time-series reasoning that provides formal, distribution-free confidence guarantees.
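The post-training step that delivers the (1−α) guarantee is standard split conformal prediction over a robustness-based nonconformity score. The sketch below is illustrative only, not the paper's implementation: the per-class robustness margins are simulated with random draws, since the real ones would come from evaluating the learned STL formulas on each signal.

```python
import numpy as np

rng = np.random.default_rng(0)
n_cal, n_test, n_classes = 500, 200, 3

def simulate_scores(n):
    # Stand-in for STL robustness rho(phi_c, x): one margin per class per
    # signal, with the true class biased toward higher robustness.
    labels = rng.integers(0, n_classes, size=n)
    scores = rng.normal(0.0, 1.0, size=(n, n_classes))
    scores[np.arange(n), labels] += 2.0
    return scores, labels

cal_scores, cal_labels = simulate_scores(n_cal)
test_scores, test_labels = simulate_scores(n_test)

alpha = 0.1
# Nonconformity = negative robustness of the true class: low robustness on
# the correct label means the example conforms poorly.
cal_nc = -cal_scores[np.arange(n_cal), cal_labels]

# Finite-sample-corrected quantile; this is what yields the distribution-free
# (1 - alpha) marginal coverage guarantee.
level = np.ceil((n_cal + 1) * (1 - alpha)) / n_cal
q = np.quantile(cal_nc, level, method="higher")

# Prediction set: every class whose nonconformity falls below the threshold.
pred_sets = (-test_scores) <= q
coverage = pred_sets[np.arange(n_test), test_labels].mean()
avg_size = pred_sets.sum(axis=1).mean()
print(f"empirical coverage: {coverage:.3f}, avg set size: {avg_size:.2f}")
```

On this synthetic data the empirical coverage lands near the nominal 90% while the average set size stays well below the three available classes, which is the coverage/size trade-off the summary refers to.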
📝 Abstract
Signal Temporal Logic (STL) inference seeks to extract human-interpretable rules from time-series data, but existing methods lack formal confidence guarantees for the inferred rules. Conformal prediction (CP) can provide statistical correctness guarantees, but it is typically applied as a post-training wrapper that does not improve model learning. Instead, we introduce an end-to-end differentiable CP framework for STL inference that enhances both the reliability and the interpretability of the resulting formulas. We design a robustness-based nonconformity score, embed a smooth CP layer directly into training, and employ a new loss function that optimizes inference accuracy and CP prediction sets simultaneously via a single term. After training, an exact CP procedure delivers statistical guarantees for the learned STL formulas. Experiments on benchmark time-series tasks show that our approach reduces predictive uncertainty (achieving high coverage with smaller prediction sets) and reduces the number of misclassifications under a fixed threshold, outperforming state-of-the-art baselines.
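The "smooth CP layer" rests on relaxing the hard set-membership indicator so that prediction-set size and coverage become differentiable in the robustness scores. A minimal sketch of that idea, with a sigmoid relaxation whose temperature `temp` and size/coverage weight `lam` are hypothetical names not taken from the paper; in practice the same expressions would run under an autograd framework so gradients flow back into formula parameters.

```python
import numpy as np

def sigmoid(z):
    # Clip to avoid overflow in exp for extreme arguments.
    return 1.0 / (1.0 + np.exp(-np.clip(z, -50.0, 50.0)))

def soft_set_size(scores, threshold, temp=0.1):
    # Smooth relaxation of |C(x)| = sum_c 1[-rho_c <= q]: class c is in the
    # set when its robustness exceeds -threshold, and the hard indicator
    # becomes sigmoid((threshold + rho_c) / temp).
    return sigmoid((threshold + scores) / temp).sum(axis=-1)

def conformal_loss(scores, labels, threshold, temp=0.1, lam=0.5):
    # Illustrative single-term objective: penalize large (soft) prediction
    # sets while rewarding soft inclusion of the true class.
    n = len(labels)
    true_in = sigmoid((threshold + scores[np.arange(n), labels]) / temp)
    size = soft_set_size(scores, threshold, temp)
    return (size - lam * true_in).mean()

# Tiny example: one signal, two classes, the true class clearly robust.
scores = np.array([[2.0, -2.0]])
labels = np.array([0])
loss = conformal_loss(scores, labels, threshold=0.0, temp=0.05)
size = soft_set_size(scores, threshold=0.0, temp=0.05)
print(f"soft set size: {size[0]:.4f}, loss: {loss:.4f}")
```

As `temp` shrinks, the soft size converges to the hard set size (here, one class in the set), recovering the exact CP behavior used at calibration time.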