🤖 AI Summary
This paper addresses the insufficient generalization performance of Least Squares Support Vector Machines (LSSVM) in high-dimensional, large-sample regimes. We propose a theoretical analysis and optimization framework based on Bootstrap ensemble learning. For the first time, we systematically incorporate Random Matrix Theory into the high-dimensional asymptotic analysis of Bootstrap-LSSVM, establishing a generalization error characterization model under joint growth of sample size $n$ and dimension $p$, where $p/n \to \gamma$. This analysis reveals convergence properties and phase-transition phenomena. Leveraging these insights, we derive closed-form, adaptive selection rules for both the number of bootstrap subsets and the regularization parameter. Combining theoretical derivation with extensive numerical experiments—across multiple synthetic and real-world datasets—we demonstrate that our strategy improves classification accuracy by 3.2–7.8%, significantly outperforming empirical hyperparameter tuning methods.
📝 Abstract
Bootstrap methods have long been a cornerstone of ensemble learning in machine learning. This paper presents a theoretical analysis of bootstrap techniques applied to Least Squares Support Vector Machine (LSSVM) ensembles in the regime where sample size and feature dimensionality grow large together. Leveraging tools from Random Matrix Theory, we investigate the performance of an ensemble classifier that aggregates decision functions from multiple weak classifiers, each trained on a different bootstrap subset of the data. This analysis clarifies how bootstrap resampling behaves in high-dimensional settings. Based on these findings, we propose strategies for selecting the number of subsets and the regularization parameter that maximize the performance of the LSSVM ensemble. Empirical experiments on synthetic and real-world datasets validate our theoretical results.