Software Bills of Materials in Maven Central

📅 2025-01-23

🤖 AI Summary

This study presents the first systematic empirical investigation of how developers publish Software Bills of Materials (SBOMs) on Maven Central. To address the lack of large-scale, registry-level SBOM analysis, we propose a registry-oriented method for SBOM discovery and dependency graph augmentation: the Goblin framework supplies the Maven Central dependency graph, and its Weaver component handles automated SBOM (SPDX/CycloneDX) parsing, graph traversal-based sampling, and multi-source data fusion. From a 10% sample of release nodes, we collected 14,071 SBOMs covering 7,290 package releases, establishing the first publicly available SBOM dataset mined from Maven Central. Results reveal critically low SBOM adoption rates and severe format fragmentation. Key contributions include: (1) the first empirically grounded, multi-source SBOM dataset; (2) a scalable, package-level SBOM discovery and graph-augmentation framework; and (3) evidence-based insights for enhancing transparency in open-source software supply chains.
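To make the SPDX/CycloneDX distinction concrete, the following is a minimal sketch of how a JSON SBOM could be classified by format. It is not the Weaver's implementation, which this page does not describe. It relies only on two top-level fields defined by the respective specifications (`bomFormat` in CycloneDX JSON and `spdxVersion` in SPDX JSON); the function name `detect_sbom_format` is hypothetical, and XML or tag-value SBOMs are deliberately out of scope.

```python
import json


def detect_sbom_format(text: str) -> str:
    """Classify a JSON SBOM document as 'CycloneDX', 'SPDX', or 'unknown'.

    Hypothetical helper for illustration; it only inspects top-level keys.
    """
    try:
        doc = json.loads(text)
    except json.JSONDecodeError:
        # Not JSON at all (could be XML or SPDX tag-value, not handled here).
        return "unknown"
    if not isinstance(doc, dict):
        return "unknown"
    # CycloneDX JSON documents carry a top-level "bomFormat": "CycloneDX".
    if doc.get("bomFormat") == "CycloneDX":
        return "CycloneDX"
    # SPDX JSON documents carry a top-level "spdxVersion", e.g. "SPDX-2.3".
    if str(doc.get("spdxVersion", "")).startswith("SPDX-"):
        return "SPDX"
    return "unknown"
```

A check like this would only be a first pass; a real pipeline would presumably also validate the document against the relevant schema before counting it as an SBOM.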

📝 Abstract

Software Bills of Materials (SBOMs) are essential to ensure the transparency and integrity of the software supply chain. There is a growing body of work that investigates the accuracy of SBOM generation tools and the challenges of producing complete SBOMs. Yet, little is known about how developers distribute SBOMs. In this work, we mine SBOMs from Maven Central to assess the extent to which developers publish SBOMs along with their artifacts. We build on the Goblin framework, which consists of a Maven Central dependency graph and a Weaver that allows augmenting the dependency graph with additional data. For this study, we selected a sample of 10% of the release nodes in the Maven Central dependency graph and collected 14,071 SBOMs from 7,290 package releases. We then augmented the Maven Central dependency graph with the collected SBOMs. We present our methodology to mine SBOMs, as well as novel insights about SBOM publication. Our dataset is the first set of SBOMs collected from a package registry. We make it available as a standalone dataset, which can be used for future research about SBOMs and package distribution.
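The mining step described above (sample 10% of release nodes, then look for SBOM files published next to each release) can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the function names, the fixed random seed, and in particular the list of file suffixes are hypothetical, and the abstract does not say which file names the study actually searched for. The only external fact relied on is the standard Maven repository layout, in which a release lives under `groupId` (dots replaced by slashes), `artifactId`, and `version`.

```python
import random

MAVEN_CENTRAL = "https://repo1.maven.org/maven2"

# Assumption: illustrative file-name suffixes an SBOM might be published
# under; the study's actual search patterns are not given in the abstract.
CANDIDATE_SUFFIXES = ("-cyclonedx.json", "-cyclonedx.xml", ".spdx.json")


def sample_releases(releases, fraction=0.10, seed=42):
    """Draw a reproducible simple random sample of release identifiers."""
    rng = random.Random(seed)
    pool = list(releases)
    k = round(len(pool) * fraction)
    return rng.sample(pool, k)


def candidate_sbom_urls(group_id, artifact_id, version):
    """Build the URLs where an SBOM for one release might be found."""
    # Standard Maven layout: groupId dots become path separators.
    base = f"{MAVEN_CENTRAL}/{group_id.replace('.', '/')}/{artifact_id}/{version}"
    return [f"{base}/{artifact_id}-{version}{suffix}" for suffix in CANDIDATE_SUFFIXES]
```

For a hypothetical release `org.example:demo:1.0`, `candidate_sbom_urls("org.example", "demo", "1.0")` yields three URLs under `.../org/example/demo/1.0/`. Each one could then be probed with an HTTP request, and the hits attached to the corresponding release node in the dependency graph.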

Problem

Research questions and friction points this paper is trying to address.

Maven Central

Software Bills of Materials (SBOMs)

Sharing Patterns

Innovation

Methods, ideas, or system contributions that make the work stand out.

SBOM Analysis

Software Dependency Mapping

Data Sharing Insights
