Identifying Risk Variables From ESG Raw Data Using A Hierarchical Variable Selection Algorithm

📅 2025-08-26
🤖 AI Summary
Problem: ESG raw data have a hierarchical tree structure, and the number of variables far exceeds the sample size, which makes high-dimensional sparse modeling difficult. Conventional approaches aggregate ESG indicators into composite scores, which loses information and hinders interpretable risk analysis. Method: We propose Hierarchical Variable Selection (HVS), a method that jointly enforces tree-structured regularization and sparsity to model the hierarchical relationships among ESG variables directly. HVS measures corporate risk by log-volatility and identifies key ESG drivers without pre-aggregation. Contribution/Results: In empirical evaluation, HVS significantly outperforms benchmark models that rely on aggregated ESG scores, achieving higher risk explanatory power and forecasting accuracy with fewer selected variables. By preserving the tree structure and producing sparse, interpretable estimates, HVS offers an approach to modeling high-dimensional, structured non-financial data.
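
The summary does not state the HVS objective, so the following is only a sketch of one common way to combine tree structure with sparsity: a tree-structured group-lasso penalty, which sums a weighted L2 norm of the coefficients under every node of the tree plus an L1 term on each raw variable. The ESG tree, the variable names, and the square-root-of-group-size weights are all assumptions made for illustration and are not taken from the paper.

```python
import math

# Hypothetical ESG tree: pillar -> category -> raw variables (all names are made up).
TREE = {
    "E": {"emissions": ["co2_scope1", "co2_scope2"], "water": ["water_use"]},
    "G": {"board": ["board_size", "independent_directors"]},
}


def leaves(node):
    """Return every raw variable under a node (a dict is a branch, a list holds leaves)."""
    if isinstance(node, list):
        return list(node)
    return [v for child in node.values() for v in leaves(child)]


def tree_penalty(node, beta):
    """Tree-structured group-lasso penalty of the coefficient dict `beta`.

    Each node contributes sqrt(group size) * L2 norm of the coefficients below it
    (the weight is an assumed convention); each raw variable adds |beta_v|.
    """
    group = leaves(node)
    norm = math.sqrt(sum(beta.get(v, 0.0) ** 2 for v in group))
    total = math.sqrt(len(group)) * norm
    if isinstance(node, dict):
        # Recurse into child branches.
        total += sum(tree_penalty(child, beta) for child in node.values())
    else:
        # Leaf level: plain lasso term on each raw variable.
        total += sum(abs(beta.get(v, 0.0)) for v in node)
    return total


# Coefficients concentrated in a single category.
beta = {"co2_scope1": 3.0, "co2_scope2": 4.0}
penalty = tree_penalty(TREE, beta)
```

Under this kind of penalty, a branch whose coefficients are all zero contributes nothing, so an optimizer minimizing loss plus penalty is pushed to switch off whole subtrees as well as individual variables.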

📝 Abstract
Environmental, Social, and Governance (ESG) factors aim to provide non-financial insights into corporations. In this study, we investigate whether we can extract relevant ESG variables to assess corporate risk, as measured by logarithmic volatility. We propose a novel Hierarchical Variable Selection (HVS) algorithm to identify a parsimonious set of variables from raw data that are most relevant to risk. HVS is specifically designed for ESG datasets characterized by a tree structure with significantly more variables than observations. Our findings demonstrate that HVS achieves significantly higher performance than models using pre-aggregated ESG scores. Furthermore, when compared with traditional variable selection methods, HVS achieves superior explanatory power using a more parsimonious set of ESG variables. We illustrate the methodology using company data from various sectors of the US economy.
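
The abstract uses logarithmic volatility as the risk measure but does not spell out how it is computed. A minimal sketch, assuming volatility means the annualized sample standard deviation of log returns over a price series; the function name, the 252-day annualization factor, and the example prices are illustrative assumptions rather than the paper's definition.

```python
import math
import statistics


def log_volatility(prices, periods_per_year=252):
    """Log of the annualized standard deviation of log returns.

    Assumed definition for illustration; the paper may use a different
    estimator, window, or annualization factor.
    """
    if len(prices) < 3:
        raise ValueError("need at least three prices to estimate volatility")
    if any(p <= 0 for p in prices):
        raise ValueError("prices must be positive")
    # Log return between consecutive observations.
    returns = [math.log(p1 / p0) for p0, p1 in zip(prices, prices[1:])]
    sigma = statistics.stdev(returns) * math.sqrt(periods_per_year)
    if sigma == 0:
        raise ValueError("volatility is zero, so its logarithm is undefined")
    return math.log(sigma)


# Made-up price path for one company.
example = log_volatility([100.0, 110.0, 100.0, 110.0])
```

Taking the logarithm maps the strictly positive volatility onto the whole real line, which suits it as the response in a linear model.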
Problem

Research questions and friction points this paper is trying to address.

Identifying relevant ESG variables for corporate risk assessment
Proposing a hierarchical algorithm for variable selection
Handling datasets with more variables than observations
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hierarchical Variable Selection (HVS) algorithm for tree-structured ESG data
Identifies parsimonious risk-relevant variables from raw data
Outperforms traditional methods and pre-aggregated scores