AI Summary
In high-stakes applications requiring simultaneous modeling of multiple conditional quantiles while preserving interpretability, this paper proposes Symbolic Quantile Regression (SQR), the first framework extending symbolic regression to conditional quantile modeling. SQR employs a tree-based evolutionary algorithm to directly synthesize closed-form quantile functions and optimizes them via quantile loss, enabling distinct characterization of covariate effects on both distributional centers and tails. Its core innovation lies in the principled integration of symbolic regression and quantile regression, jointly ensuring global interpretability, sensitivity to extreme values, and cross-quantile comparability of effect estimates. Experiments demonstrate that SQR outperforms conventional interpretable models across multiple benchmarks, matches the predictive accuracy of state-of-the-art black-box methods, and successfully identifies heterogeneous influential factors at different quantiles in an aviation fuel demand forecasting case study.
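To make the "optimizes them via quantile loss" step concrete, here is a minimal sketch of the standard quantile (pinball) loss used to score two closed-form candidates at different quantile levels. The function names, candidate expressions, and toy data are hypothetical illustrations and are not taken from the paper; the evolutionary search that SQR uses to generate such candidates is not shown.

```python
def pinball_loss(y_true, y_pred, tau):
    """Mean quantile (pinball) loss at quantile level tau, with 0 < tau < 1."""
    if not 0.0 < tau < 1.0:
        raise ValueError("tau must lie strictly between 0 and 1")
    total = 0.0
    for y, q in zip(y_true, y_pred):
        diff = y - q
        # Under-prediction is weighted by tau, over-prediction by (1 - tau).
        total += tau * diff if diff >= 0 else (tau - 1.0) * diff
    return total / len(y_true)


# Hypothetical closed-form candidates, standing in for evolved expression trees.
def candidate_low(x):
    return 2.0 * x


def candidate_high(x):
    return 2.0 * x + 1.0


# Toy data (assumed for illustration only).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.5, 4.0, 7.5, 8.0]

pred_low = [candidate_low(x) for x in xs]
pred_high = [candidate_high(x) for x in xs]

# Score each candidate at an upper and a lower quantile level.
loss_low_90 = pinball_loss(ys, pred_low, 0.9)
loss_high_90 = pinball_loss(ys, pred_high, 0.9)
loss_low_10 = pinball_loss(ys, pred_low, 0.1)
loss_high_10 = pinball_loss(ys, pred_high, 0.1)
```

On this toy data the higher-intercept candidate scores better at the 0.9 level and the lower one scores better at the 0.1 level. This is the asymmetry that lets a search procedure select different expressions for different quantiles.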
Abstract
Symbolic Regression (SR) is a well-established framework for generating interpretable or white-box predictive models. Although SR has been successfully applied to create interpretable estimates of the average of the outcome, it is currently not well understood how it can be used to estimate the relationship between variables at other points in the distribution of the target variable. Such estimates of e.g. the median or an extreme value provide a fuller picture of how predictive variables affect the outcome and are necessary in high-stakes, safety-critical application domains. This study introduces Symbolic Quantile Regression (SQR), an approach to predict conditional quantiles with SR. In an extensive evaluation, we find that SQR outperforms transparent models and performs comparably to a strong black-box baseline without compromising transparency. We also show how SQR can be used to explain differences in the target distribution by comparing models that predict extreme and central outcomes in an airline fuel usage case study. We conclude that SQR is suitable for predicting conditional quantiles and understanding interesting feature influences at varying quantiles.