🤖 AI Summary
This study addresses a limitation of existing software security analysis tools: they typically assess vulnerabilities in isolation and struggle to identify cascading vulnerability chains across interdependent components, a problem compounded by substantial discrepancies among the outputs of Software Bill of Materials (SBOM) tools. The work represents enriched SBOM data as a heterogeneous graph of components, dependencies, vulnerabilities, and associated weaknesses. A Heterogeneous Graph Attention Network (HGAT) is trained to predict whether a component is associated with at least one known vulnerability. Because documented multi-vulnerability chains are scarce, cascade discovery is formulated separately as a link prediction problem over CVE pairs, handled by a multilayer perceptron whose ranked candidate links can be composed into multi-step paths. The reported results apply to the HGAT component classifier, which achieves an accuracy of 91.03% and an F1 score of 74.02%.
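To make the graph construction concrete, here is a minimal sketch of a typed (heterogeneous) graph built from SBOM-like records, using only the Python standard library. The record layout, the relation names, and the placeholder identifiers (`libA`, `CVE-A`, `CWE-X`) are assumptions made for illustration, not the authors' schema. In particular, dependencies are modeled here as edges between components, which is only one possible reading of the description. The label function mirrors the stated classification target: whether a component is associated with at least one known vulnerability.

```python
from collections import defaultdict

# Hypothetical enriched-SBOM records; every identifier is a placeholder.
components = ["app", "libA", "libB"]
depends_on = [("app", "libA"), ("libA", "libB")]
has_vuln = [("libA", "CVE-A"), ("libB", "CVE-B")]
has_weakness = [("CVE-A", "CWE-X"), ("CVE-B", "CWE-Y")]


def build_hetero_graph(components, depends_on, has_vuln, has_weakness):
    """Return (nodes, edges): node sets keyed by node type and edge lists
    keyed by (source type, relation, target type)."""
    nodes = {
        "component": set(components),
        "vulnerability": set(),
        "weakness": set(),
    }
    edges = defaultdict(list)
    for src, dst in depends_on:
        edges[("component", "depends_on", "component")].append((src, dst))
    for comp, cve in has_vuln:
        nodes["vulnerability"].add(cve)
        edges[("component", "has_vulnerability", "vulnerability")].append((comp, cve))
    for cve, cwe in has_weakness:
        nodes["weakness"].add(cwe)
        edges[("vulnerability", "has_weakness", "weakness")].append((cve, cwe))
    return nodes, dict(edges)


def component_labels(nodes, edges):
    """Binary label per component: 1 if it has at least one known vulnerability."""
    key = ("component", "has_vulnerability", "vulnerability")
    vulnerable = {comp for comp, _ in edges.get(key, [])}
    return {comp: int(comp in vulnerable) for comp in sorted(nodes["component"])}


nodes, edges = build_hetero_graph(components, depends_on, has_vuln, has_weakness)
labels = component_labels(nodes, edges)
```

A structure like this could then be converted into whatever typed-graph container a graph learning library expects. That step is omitted here because the source does not name a framework.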
📝 Abstract
Most current software security analysis tools assess vulnerabilities in isolation. However, sophisticated software supply chain threats often stem from cascading chains of vulnerabilities and security weaknesses that span dependent components. Moreover, although the adoption of Software Bills of Materials (SBOMs) has been accelerating, downstream vulnerability findings vary substantially across SBOM generators and analysis tools. We propose a novel approach to SBOM-driven security analysis that models vulnerability relationships over the dependency structure rather than treating scanner outputs as independent records. We represent enriched SBOMs as heterogeneous graphs whose nodes are the SBOM components and dependencies, the known software vulnerabilities, and the known software security weaknesses. We then train a Heterogeneous Graph Attention Network (HGAT) to predict whether a component is associated with at least one known vulnerability. Because documented multi-vulnerability chains are scarce, we model cascade discovery as a link prediction problem over CVE pairs using a multi-layer perceptron, which produces ranked candidate links that can be composed into multi-step paths. The HGAT component classifier achieves an accuracy of 91.03% and an F1-score of 74.02%.
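The final step, composing ranked candidate links into multi-step paths, can be sketched as follows. The abstract does not specify how links are composed, so several choices here are assumptions for illustration: links are treated as directed, links below a score threshold are discarded, and a path is scored by the product of its link scores. The scores themselves are hypothetical stand-ins for the output of a link predictor, and the CVE identifiers are placeholders.

```python
def compose_paths(scored_links, threshold=0.5, max_nodes=4):
    """Chain directed candidate links (src, dst) -> score into multi-step paths.

    Keeps links scoring at least `threshold`, enumerates simple paths with at
    least two links and at most `max_nodes` nodes, and ranks them by the
    product of their link scores (an assumed scoring rule).
    """
    adjacency = {}
    for (src, dst), score in scored_links.items():
        if score >= threshold:
            adjacency.setdefault(src, []).append((dst, score))

    paths = []

    def extend(path, score):
        if len(path) >= 3:  # at least two links, i.e. a multi-step path
            paths.append((score, tuple(path)))
        if len(path) == max_nodes:
            return
        for nxt, link_score in adjacency.get(path[-1], []):
            if nxt not in path:  # never revisit a node, so paths stay simple
                extend(path + [nxt], score * link_score)

    for start in adjacency:
        extend([start], 1.0)

    # Highest score first; ties are broken by the path itself for determinism.
    return sorted(paths, key=lambda item: (-item[0], item[1]))


# Hypothetical link-prediction scores over placeholder CVE identifiers.
scored_links = {
    ("CVE-A", "CVE-B"): 0.9,
    ("CVE-B", "CVE-C"): 0.8,
    ("CVE-A", "CVE-C"): 0.3,  # below the threshold, so it is dropped
    ("CVE-C", "CVE-D"): 0.6,
}
ranked = compose_paths(scored_links)
```

With these inputs, the two-link path from `CVE-A` through `CVE-B` to `CVE-C` ranks first. Multiplying scores penalizes longer chains, which is one reasonable way to keep weakly supported paths from dominating the ranking, but other aggregation rules (minimum score, sum of log scores) would fit the same description.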