AI Summary
Addressing the longstanding trade-off between predictive accuracy and model interpretability in machine learning, this paper evaluates FOLD-SE, a rule-learning algorithm designed for transparent classification. The study compares FOLD-SE with FOLD-R++ on binary classification and with XGBoost on multiclass classification, using accuracy, F1 score, and processing time as the performance measures. In binary classification, FOLD-SE produces fewer rules than FOLD-R++ at the cost of a minor loss in accuracy and processing-time efficiency; in multiclass settings, it is more accurate and far more efficient than XGBoost. FOLD-SE also generates compact, traceable rule sets, avoiding a key limitation of black-box models. The results suggest that rule-based approaches like FOLD-SE can narrow the gap between explainability and performance and are viable alternatives to black-box models for classification.
Abstract
Recently, the demand for Machine Learning (ML) models that balance accuracy, efficiency, and interpretability has grown significantly. Traditionally, there has been a tradeoff between accuracy and explainability in predictive models, with models such as Neural Networks achieving high accuracy on complex datasets while sacrificing internal transparency. As such, new rule-based algorithms such as FOLD-SE have been developed that provide tangible justification for predictions in the form of interpretable rule sets. The primary objective of this study was to compare FOLD-SE and FOLD-R++, both rule-based classifiers, in binary classification and to evaluate how FOLD-SE performs against XGBoost, a widely used ensemble classifier, in multiclass classification. We hypothesized that because FOLD-SE generates a more condensed, more explainable rule set, it would lose upwards of 3 percent on average in accuracy and F1 score when compared with XGBoost and FOLD-R++ in multiclass and binary classification, respectively. The study used classification datasets, with accuracy, F1 score, and processing time as the primary performance measures. The results show that in binary classification FOLD-SE outperforms FOLD-R++ by producing fewer rules, while losing a minor percentage of accuracy and processing-time efficiency; in multiclass classification, FOLD-SE is more accurate and far more efficient than XGBoost, in addition to generating a comprehensible rule set. These results indicate that FOLD-SE is the better choice for both binary and multiclass tasks, and that rule-based approaches like FOLD-SE can bridge the gap between explainability and performance, highlighting their potential as viable alternatives to black-box models in diverse classification tasks.
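
For readers unfamiliar with the three performance measures named in the abstract, the following is a minimal Python sketch of how accuracy, F1 score, and processing time might be computed for any classifier. It is not the study's evaluation code. The abstract does not say how F1 is averaged across classes, so macro averaging is an assumption here, and `toy_rule_classifier`, the feature name `x`, and the four data rows are hypothetical stand-ins rather than FOLD-SE output or the study's data.

```python
import time


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (macro averaging is an assumption)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0.0 when the class never appears.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def timed_predict(predict, rows):
    """Run `predict` on every row and return (predictions, elapsed seconds)."""
    start = time.perf_counter()
    predictions = [predict(row) for row in rows]
    return predictions, time.perf_counter() - start


def toy_rule_classifier(row):
    """Hypothetical hand-written rule set, used only to exercise the metrics."""
    if row["x"] < 2.5:
        return "a"
    if row["x"] < 5.0:
        return "b"
    return "c"


# Hypothetical data: four rows with one numeric feature and three classes.
rows = [{"x": 1.4}, {"x": 4.5}, {"x": 5.8}, {"x": 4.9}]
y_true = ["a", "b", "c", "c"]

y_pred, elapsed = timed_predict(toy_rule_classifier, rows)
acc = accuracy(y_true, y_pred)
f1 = macro_f1(y_true, y_pred)
print(f"accuracy={acc:.3f} macro_f1={f1:.3f} time={elapsed:.6f}s")
```

Running the same three functions over each classifier's predictions on a shared test split would give directly comparable numbers. Measured processing time depends on the hardware and on whether training, inference, or both are timed.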