🤖 AI Summary
The rapid expansion of open-source software (OSS) ecosystems has increased security risk, yet the gap between accelerating vulnerability emergence and lagging remediation has not been quantified. To address this, we systematically analyze 31,267 unique vulnerability reports covering 14,675 packages across 10 programming languages, integrating heterogeneous data from GitHub Security Advisories, Snyk.io, and package registries to construct a cross-language, cross-platform vulnerability evolution analysis framework. Our findings reveal: (1) reported vulnerabilities growing at an annual rate of 98%, far outpacing the 25% average annual growth in packages; (2) an 85% increase in the average vulnerability lifespan over the studied period; (3) on average, seven CWE types accounting for over 50% of all reported vulnerabilities; and (4) vulnerabilities associated with intentionally malicious packages making up 49% of reports in NPM and 14% in PyPI. These shares point to malicious packages as a major source of reported vulnerabilities and indicate targeted attacks within package repositories. Based on these insights, we propose a prioritized response checklist and an empirically grounded benchmark for OSS security governance.
📝 Abstract
Open-source software (OSS) has become increasingly popular across many domains. However, this rapid development and widespread adoption come at a security cost: the growing complexity and openness of OSS ecosystems have widened their attack surface and increased their exposure to vulnerabilities. This paper investigates the trends and patterns of reported vulnerabilities within OSS platforms, focusing on the implications of these findings for security practices. To understand the dynamics of OSS vulnerabilities, we analyze a dataset of 31,267 unique vulnerability reports from GitHub's advisory database and Snyk.io, covering 14,675 packages across 10 programming languages. Our analysis reveals a significant surge in reported vulnerabilities, which increase at an annual rate of 98%, far outpacing the 25% average annual growth in the number of OSS packages. Additionally, we observe an 85% increase in the average lifespan of vulnerabilities across ecosystems during the studied period, indicating a potential decline in security. We identify the most prevalent Common Weakness Enumerations (CWEs) across programming languages and find that, on average, just seven CWEs are responsible for over 50% of all reported vulnerabilities. We further examine these commonly observed CWEs and highlight ecosystem-specific trends. Notably, we find that vulnerabilities associated with intentionally malicious packages comprise 49% of reports in the NPM ecosystem and 14% in PyPI, an alarming indication of targeted attacks within package repositories. We conclude with an in-depth discussion of the characteristics and attack vectors associated with these malicious packages.
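The abstract reports two kinds of summary statistics: an annual growth rate and the number of CWEs needed to cover half of all reports. The sketch below shows one plausible way to compute such figures. It is an illustration under stated assumptions, not the authors' method: the abstract does not say how the growth rate was derived, so a compound annual growth rate is assumed here, and the function names and toy inputs are hypothetical and are not data from the paper.

```python
from collections import Counter


def compound_annual_growth_rate(first_count, last_count, years):
    """Yearly growth rate that turns first_count into last_count over `years` years.

    Assumption: growth is compounded annually; the paper may average
    year-over-year changes instead.
    """
    if first_count <= 0 or years <= 0:
        raise ValueError("first_count and years must be positive")
    return (last_count / first_count) ** (1 / years) - 1


def smallest_covering_set(cwe_labels, share=0.5):
    """Most frequent CWE labels whose combined reports reach `share` of the total."""
    counts = Counter(cwe_labels)
    total = sum(counts.values())
    covered, chosen = 0, []
    # most_common() yields labels from most to least frequent.
    for cwe, n in counts.most_common():
        if covered >= share * total:
            break
        chosen.append(cwe)
        covered += n
    return chosen


# Hypothetical toy inputs, NOT figures from the paper.
toy_growth = compound_annual_growth_rate(first_count=100, last_count=400, years=2)
toy_labels = ["CWE-79"] * 4 + ["CWE-22"] * 3 + ["CWE-400"] * 2 + ["CWE-94"]
toy_top = smallest_covering_set(toy_labels, share=0.5)

print(f"toy annual growth: {toy_growth:.0%}")
print(f"CWEs covering half of the toy reports: {toy_top}")
```

Applied per ecosystem, the second function would give the count of CWEs behind the "over 50%" threshold for each language, and averaging those counts is one way a figure such as "seven on average" could arise.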