The Popularity Hypothesis in Software Security: A Large-Scale Replication with PHP Packages

📅 2025-02-23
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study investigates whether a statistically significant association exists between popularity and security in open-source PHP packages. Method: A large-scale empirical analysis of nearly 400,000 PHP packages combines version-history parsing, CVE vulnerability mapping, and popularity metrics (e.g., download counts, GitHub stars), and uses non-parametric statistical tests, including the Mann–Whitney U test, to assess significance. Contribution/Results: The study replicates the “popularity–vulnerability” hypothesis within the PHP ecosystem: packages with known CVEs are significantly more popular than those without (p < 0.001). It is presented as the first large-scale, language-specific validation of this correlation in PHP, and it strengthens the empirical foundation of software security knowledge.
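
The group comparison described in the summary can be illustrated with a minimal sketch of a two-sided Mann–Whitney U test. Everything here is an assumption for illustration: the function name `mann_whitney_u` is hypothetical, the popularity figures are invented toy values rather than data from the study, and the p-value uses the normal approximation with tie correction and no continuity correction. The paper's actual tooling is not described on this page.

```python
import math


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation, tie-corrected).

    Hypothetical helper for illustration; returns (U for sample x, p-value).
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    # Pool both samples, remembering which group each value came from.
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y], key=lambda t: t[0])

    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        # Find the run of equal values starting at position i.
        j = i
        while j + 1 < n and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        mid_rank = (i + j) / 2 + 1  # 1-based average rank for the tied run
        for k in range(i, j + 1):
            ranks[k] = mid_rank
        t = j - i + 1
        tie_term += t**3 - t
        i = j + 1

    rank_sum_x = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2

    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u_x, 1.0  # all values identical: no evidence of a difference
    z = (u_x - mean_u) / math.sqrt(var_u)
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return u_x, p_two_sided


# Invented toy popularity values (e.g., download counts), NOT from the study.
with_cve = [1200, 5400, 830, 15000, 2200, 9100, 640, 4700]
without_cve = [15, 230, 48, 610, 90, 12, 300, 75, 1100, 5]

u, p = mann_whitney_u(with_cve, without_cve)
print(f"U = {u:.1f}, p = {p:.4f}")
```

A rank-based test suits popularity data because download and star counts tend to be heavily skewed, so comparing ranks avoids assuming a normal distribution. For real analyses, an established implementation such as `scipy.stats.mannwhitneyu` would be the more sensible choice.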

📝 Abstract
A long-standing hypothesis in both research and popular discourse holds that a software's popularity is related to its security or insecurity. A few empirical studies have also examined the hypothesis, either explicitly or implicitly. The present work continues and contributes to this research with a replication-motivated, large-scale analysis of software written in the PHP programming language. The dataset examined contains nearly four hundred thousand open source software packages written in PHP. According to the results, which are based on reported security vulnerabilities, the hypothesis holds: packages that have been affected by vulnerabilities over their release histories are generally more popular than packages that have not been affected by a single vulnerability. With these replication results, the paper contributes to the efforts to strengthen the empirical knowledge base in cyber and software security.
Problem

Research questions and friction points this paper is trying to address.

Examines software popularity-security link
Large-scale PHP package analysis
Validates vulnerability-popularity correlation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Large-scale PHP package analysis
Security vulnerability popularity correlation
Empirical replication in software security