Splitting criteria for ordinal decision trees: an experimental study

📅 2024-12-18
🏛️ arXiv.org
🤖 AI Summary
This paper addresses the performance degradation of conventional decision trees in ordinal classification (OC) tasks, which stems from their neglect of the order among class labels. To remedy this, the authors systematically survey and evaluate order-aware splitting criteria. They introduce a unified notation and conduct the first large-scale empirical comparison, across a benchmark of 45 publicly available OC datasets, of several ordinal-specific criteria: Ordinal Gini (OGini), Weighted Information Gain (WIG), and Ranking Impurity (RI). Results show that OGini significantly outperforms the nominal Gini and standard information gain criteria, establishing it as the current state-of-the-art ordinal splitting criterion. The criteria integrate seamlessly into standard decision tree frameworks and are evaluated with ordinal-sensitive metrics such as MAE and ORMSE. All code, datasets, and experimental results are open-sourced to foster reproducible research in ordinal learning.
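The summary above mentions ordinal-sensitive evaluation metrics such as MAE. As a quick illustration (not the paper's code), MAE over labels encoded as ranks penalises a prediction by how far it lands from the true class, unlike plain accuracy:

```python
def ordinal_mae(y_true, y_pred):
    """Mean absolute error between class labels encoded as ranks 1..J.
    A 2-class miss costs twice as much as a 1-class miss, which plain
    0/1 accuracy cannot distinguish."""
    assert len(y_true) == len(y_pred)
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

# Both predictions get 2 of 4 labels wrong, but the second misses
# by larger distances on the ordinal scale:
print(ordinal_mae([1, 2, 3, 4], [1, 3, 3, 3]))  # 0.5
print(ordinal_mae([1, 2, 3, 4], [3, 2, 3, 2]))  # 1.0
```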

📝 Abstract
Ordinal Classification (OC) is a machine learning field that addresses classification tasks where the labels exhibit a natural order. Unlike nominal classification, which treats all classes as equally distinct, OC takes the ordinal relationship into account, producing more accurate and relevant results. This is particularly critical in applications where the magnitude of classification errors has practical consequences. Despite this, OC problems are often tackled using nominal methods, leading to suboptimal solutions. Although decision trees are one of the most popular classification approaches, ordinal tree-based approaches have received less attention than other classifiers. This work conducts an experimental study of tree-based methodologies specifically designed to capture ordinal relationships. A comprehensive survey of ordinal splitting criteria is provided, standardising the notations used in the literature for clarity. Three ordinal splitting criteria, Ordinal Gini (OGini), Weighted Information Gain (WIG), and Ranking Impurity (RI), are compared to the nominal counterparts of the first two (Gini and information gain), by incorporating them into a decision tree classifier. An extensive repository considering 45 publicly available OC datasets is presented, supporting the first experimental comparison of ordinal and nominal splitting criteria using well-known OC evaluation metrics. Statistical analysis of the results highlights OGini as the most effective ordinal splitting criterion to date. Source code, datasets, and results are made available to the research community.
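To illustrate how an order-aware impurity differs from a nominal one, here is a minimal sketch of the nominal Gini index alongside one common cumulative-probability formulation of Ordinal Gini; the exact definitions used in the paper may differ:

```python
def gini(counts):
    """Nominal Gini impurity: sum over classes of p_j * (1 - p_j)."""
    n = sum(counts)
    return sum((c / n) * (1 - c / n) for c in counts)

def ogini(counts):
    """Ordinal Gini (assumed cumulative-probability form):
    sum over j < J of F_j * (1 - F_j), where F_j is the cumulative
    probability of classes 1..j. Mass spread across distant classes
    on the ordered scale is penalised more than mass in adjacent ones."""
    n = sum(counts)
    total, acc = 0.0, 0.0
    for c in counts[:-1]:      # F_J == 1 contributes nothing
        acc += c / n
        total += acc * (1 - acc)
    return total

# Nominal Gini cannot tell these two nodes apart, but OGini can:
# with ordered classes, {5, 5, 0} (adjacent) is purer than {5, 0, 5}.
print(gini([5, 5, 0]), gini([5, 0, 5]))    # 0.5 0.5
print(ogini([5, 5, 0]), ogini([5, 0, 5]))  # 0.25 0.5
```

This is why an ordinal criterion can prefer splits whose errors stay close on the label scale, even when a nominal criterion scores both splits identically.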
Problem

Research questions and friction points this paper is trying to address.

How can decision tree splitting criteria exploit the natural order among class labels?
Do ordinal splitting criteria outperform their nominal counterparts on OC tasks?
Which ordinal splitting criterion is the most effective in practice?
Innovation

Methods, ideas, or system contributions that make the work stand out.

Unified notation and survey of ordinal splitting criteria
First experimental comparison of ordinal and nominal splitting criteria on 45 OC datasets
OGini identified as the most effective ordinal splitting criterion to date
Rafael Ayllón-Gavilán
Department of Clinical-Epidemiological Research in Primary Care, IMIBIC, Spain
Francisco José Martínez-Estudillo
Department of Quantitative Methods, Universidad Loyola Andalucía, Spain
David Guijo-Rubio
Assistant Professor, University of Córdoba
time series machine learning, ordinal classification
César Hervás-Martínez
Department of Computer Science and Numerical Analysis, University of Córdoba, Spain
Pedro Antonio Gutiérrez
Department of Computer Science and Numerical Analysis, University of Córdoba, Spain