🤖 AI Summary
To address the prohibitively high cost of retraining classical decision trees as streaming data accumulates, this paper proposes Des-q, a quantum algorithm for constructing and retraining decision trees for regression and binary classification. Methodologically, it achieves, for the first time, tree retraining with time complexity logarithmic in the number of training examples; it partitions the input feature space with a piecewise-linear hyperplane splitting strategy, using QRAM-based quantum-supervised clustering to adaptively determine the anchor points of those splits. Theoretical analysis rigorously establishes the asymptotic speedup, and experiments across multiple benchmark datasets show that the method significantly outperforms state-of-the-art classical approaches in retraining speed while maintaining comparable prediction accuracy, supporting both the practical utility and robustness of the quantum-accelerated framework.
📝 Abstract
Decision trees are widely adopted machine learning models due to their simplicity and explainability. However, as training data size grows, standard methods become increasingly slow, scaling polynomially with the number of training examples. In this work, we introduce Des-q, a novel quantum algorithm to construct and retrain decision trees for regression and binary classification tasks. Assuming the data stream produces small, periodic increments of new training examples, Des-q significantly reduces the tree retraining time. Des-q achieves a logarithmic complexity in the combined total number of old and new examples, even accounting for the time needed to load the new samples into quantum-accessible memory. To grow the tree from any given node, our approach performs piecewise linear splits that generate multiple hyperplanes, partitioning the input feature space into distinct regions. To determine suitable anchor points for these splits, we develop an efficient quantum-supervised clustering method, building upon the q-means algorithm introduced by Kerenidis et al. We benchmark the simulated version of Des-q against state-of-the-art classical methods on multiple datasets and observe that our algorithm exhibits performance similar to state-of-the-art decision trees while significantly speeding up the periodic tree retraining.
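The node-splitting idea in the abstract can be illustrated with a minimal classical sketch (not the quantum algorithm, and not the paper's q-means variant): cluster the training examples with a label-aware k-means, then route each point to its nearest cluster centroid (anchor), which partitions the feature space into piecewise-linear regions — one child node per anchor. All function names, the deterministic initialization, and the `label_weight` parameter below are illustrative assumptions, not from the paper.

```python
# Illustrative classical sketch of anchor-based piecewise-linear node splitting.
# Label information is injected by appending a scaled label as an extra
# coordinate before clustering (a stand-in for the paper's supervised
# clustering); anchors are then projected back to the feature space.

def supervised_kmeans(points, labels, k, label_weight=1.0, iters=20):
    """Cluster (feature, label) pairs; return k anchor points in feature space."""
    aug = [list(p) + [label_weight * y] for p, y in zip(points, labels)]
    # Deterministic spread-out initialization (illustrative choice).
    centroids = [aug[i * len(aug) // k] for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for a in aug:  # assign each point to its nearest centroid
            j = min(range(k),
                    key=lambda j: sum((x - c) ** 2 for x, c in zip(a, centroids[j])))
            groups[j].append(a)
        # Recompute centroids; keep the old one if a group went empty.
        centroids = [[sum(col) / len(g) for col in zip(*g)] if g else centroids[j]
                     for j, g in enumerate(groups)]
    # Drop the appended label coordinate: anchors live in the feature space.
    return [c[:-1] for c in centroids]

def assign_child(x, anchors):
    """Route a feature vector to the child node of its nearest anchor."""
    return min(range(len(anchors)),
               key=lambda j: sum((xi - ai) ** 2 for xi, ai in zip(x, anchors[j])))
```

With two well-separated label groups, nearby points are routed to the same child and distant points to different children; the decision boundaries between children are the (linear) perpendicular bisectors of anchor pairs, giving the piecewise-linear partition.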