AI Summary
Kernelized quadratic SVMs suffer from severe overfitting and poor interpretability in high-dimensional settings due to quadratic growth in the number of quadratic-term parameters.
Method: This paper introduces, for the first time, an $\ell_0$-norm sparsity constraint on the quadratic coefficients to explicitly limit the number of nonzero interaction terms. To handle the resulting nonconvex, discontinuous optimization problem, a penalty decomposition algorithm with guaranteed convergence is proposed; its subproblems either admit closed-form solutions or can be accelerated via duality.
Contributions/Results: Theoretical analysis and extensive experiments on real-world datasets demonstrate that the method significantly mitigates overfitting, improves generalization performance, and enhances the accuracy of feature (interaction) selection. An open-source implementation confirms its robustness and computational efficiency. This work establishes a new paradigm for high-dimensional quadratic classification that jointly ensures statistical efficiency and model interpretability.
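The penalty decomposition scheme described above can be sketched as follows. This is a minimal illustration on a generic smooth loss, not the paper's implementation: the function names, the gradient-descent w-step, and the step-size rule `1/(lip + rho)` are assumptions. Only the z-step, a hard-thresholding projection onto the ℓ0 ball, reflects the closed-form subproblem solution the summary refers to.

```python
import numpy as np

def project_l0(v, s):
    """Closed-form projection onto {z : ||z||_0 <= s}: keep the s
    largest-magnitude entries of v and zero out the rest."""
    z = np.zeros_like(v)
    idx = np.argsort(np.abs(v))[-s:]
    z[idx] = v[idx]
    return z

def penalty_decomposition(grad_f, lip, w0, s, rho=1.0, rho_growth=1.5,
                          inner_steps=100, outer_steps=20):
    """Sketch of penalty decomposition for min f(w) s.t. ||w||_0 <= s.

    The l0 constraint is decoupled via an auxiliary variable z and the
    quadratic penalty (rho/2)||w - z||^2; the algorithm alternates a
    gradient w-step on the penalized loss with the closed-form z-step,
    increasing rho between outer iterations.

    grad_f : gradient of the smooth loss f
    lip    : Lipschitz constant estimate for grad_f (sets the step size)
    """
    w = w0.astype(float).copy()
    z = project_l0(w, s)
    for _ in range(outer_steps):
        step = 1.0 / (lip + rho)  # safe step for the rho-penalized loss
        for _ in range(inner_steps):
            w = w - step * (grad_f(w) + rho * (w - z))
        z = project_l0(w, s)      # closed-form subproblem
        rho *= rho_growth
    return z
```

On a toy separable objective such as $f(w) = \|w - c\|^2$, the returned iterate concentrates on the $s$ dominant coordinates of $c$, illustrating how the scheme enforces sparsity without a convex surrogate.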
Abstract
Kernel-free quadratic surface support vector machine (SVM) models have gained significant attention in machine learning. However, introducing a quadratic classifier increases the model's complexity: the number of parameters grows quadratically with the data dimension, which exacerbates overfitting. To address this, we propose sparse $\ell_0$-norm based kernel-free quadratic surface SVMs, designed to mitigate overfitting and enhance interpretability. Given the intractable nature of these models, we present a penalty decomposition algorithm that efficiently obtains first-order optimality points. Our analysis shows that the subproblems in this framework either admit closed-form solutions or can leverage duality theory to improve computational efficiency. Through empirical evaluations on real-world datasets, we demonstrate the efficacy and robustness of our approach, showcasing its potential to advance kernel-free quadratic surface SVMs in practical applications while addressing overfitting concerns. All implemented models and experiment code are available at https://github.com/raminzandvakili/L0-QSVM.
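For concreteness, a common form of the ℓ0-constrained kernel-free quadratic surface SVM can be sketched as below; the exact objective, the notation ($W$, $b$, $c$, sparsity budget $s$, trade-off $C$), and the use of the hinge loss are assumptions for illustration, not taken from the abstract:

```latex
\min_{W = W^{\top},\, b,\, c} \;
\frac{1}{2}\sum_{i=1}^{n} \bigl\| W x_i + b \bigr\|_2^2
\;+\; C \sum_{i=1}^{n} \max\!\Bigl(0,\; 1 - y_i \bigl(\tfrac{1}{2} x_i^{\top} W x_i + b^{\top} x_i + c\bigr)\Bigr)
\quad \text{s.t.} \quad \bigl\| \operatorname{vec}(W) \bigr\|_0 \le s
```

Here the quadratic decision surface is $\tfrac{1}{2} x^{\top} W x + b^{\top} x + c = 0$, and the ℓ0 constraint caps the number of nonzero entries of $W$, i.e., the number of retained quadratic interaction terms.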