AI Summary
To address the dual requirements of interpretability and robustness in high-stakes domains (e.g., healthcare, aviation), this paper proposes LCEN (LASSO-Clip-EN), a nonlinear sparse modeling framework that chains LASSO-based feature selection, coefficient clipping, and elastic net regularization. By fitting sparse linear models on nonlinear feature expansions, LCEN preserves linear interpretability while capturing nonlinear relationships, and it is designed to handle noise, multicollinearity, and small-sample regimes. Experiments demonstrate that, on one dataset, LCEN reduces feature counts by 10.8× versus dense models and by 8.1× versus standard elastic net, while matching or exceeding the predictive accuracy of artificial neural networks. Crucially, LCEN automatically recovers known physical laws from data, enhancing model trustworthiness and practical utility in safety-critical applications.
Abstract
Interpretable architectures can have advantages over black-box architectures, and interpretability is essential for the application of machine learning in critical settings, such as aviation or medicine. However, the simplest, most commonly used interpretable architectures, such as LASSO or elastic net (EN), are limited to linear predictions and have poor feature selection capabilities. In this work, we introduce the LASSO-Clip-EN (LCEN) algorithm for the creation of nonlinear, interpretable machine learning models. LCEN is tested on a wide variety of artificial and empirical datasets, frequently creating more accurate, sparser models than other architectures, including those for building sparse, nonlinear models. LCEN is robust against many issues typically present in datasets and modeling, including noise, multicollinearity, data scarcity, and hyperparameter variance. LCEN is also able to rediscover multiple physical laws from empirical data and, for processes with no known physical laws, LCEN achieves better results than many other dense and sparse methods, including using 10.8-fold fewer features than dense methods and 8.1-fold fewer features than EN on one dataset, and is comparable to or better than ANNs on multiple datasets.
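The pipeline named by the algorithm (LASSO, then clipping, then elastic net, applied to nonlinear feature expansions) can be illustrated with a minimal sketch. This is not the paper's implementation: the polynomial expansion degree, regularization strengths, and clipping threshold below are placeholder assumptions chosen only to demonstrate the three stages.

```python
import numpy as np
from sklearn.linear_model import Lasso, ElasticNet
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

# Synthetic data: the response depends nonlinearly on x0 and linearly on x1.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = 2.0 * X[:, 0] ** 2 - 0.5 * X[:, 1] + rng.normal(scale=0.1, size=200)

# 1) Expand inputs into nonlinear candidate features (polynomial terms here).
expand = PolynomialFeatures(degree=2, include_bias=False)
X_exp = StandardScaler().fit_transform(expand.fit_transform(X))

# 2) LASSO pass: obtain a sparse first estimate of the coefficients.
lasso = Lasso(alpha=0.05).fit(X_exp, y)

# 3) Clip: discard features whose coefficient magnitude falls below a cutoff
#    (threshold value is an illustrative placeholder).
keep = np.abs(lasso.coef_) > 1e-2

# 4) Elastic net refit on only the surviving features.
en = ElasticNet(alpha=0.01, l1_ratio=0.5).fit(X_exp[:, keep], y)

print("candidate features:", X_exp.shape[1])
print("features kept after clipping:", int(keep.sum()))
```

Because the final model is linear in the retained (expanded) features, its coefficients can be read off directly, which is how such a method can surface an underlying physical law from data.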