Analyzing the Availability of E-Mail Addresses for PyPI Libraries

📅 2026-01-20
🤖 AI Summary
The sustainability of open-source Python libraries depends on maintainers being reachable, yet the availability and validity of their e-mail addresses have not been systematically assessed. We present a large-scale empirical analysis of e-mail contacts across 686,034 packages on PyPI and their corresponding GitHub repositories, combining web scraping, e-mail validation, and dependency graph construction to quantify the distribution and coverage of contact information. Our findings show that 81.6% of packages contain at least one valid e-mail address, and that up to 97.7% of transitive dependencies provide valid contact information. We also identify over 698,000 invalid e-mail records, which points to strong maintainer reachability overall and to concrete opportunities for improvement.
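To make the notion of a "valid" versus "invalid" e-mail record concrete, here is a minimal Python sketch. The study's actual validation criteria are not described on this page, so the three categories, the regular expression, and the function names `classify_email` and `has_valid_contact` are illustrative assumptions rather than the authors' method.

```python
import re

# Assumption: a deliberately simple syntactic check (one "@", no whitespace,
# a dotted domain). The study's real validation rules are not given here.
_EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s.]+$")


def classify_email(raw):
    """Classify one metadata field as 'missing', 'malformed', or 'valid'."""
    if raw is None or not raw.strip():
        return "missing"
    return "valid" if _EMAIL_RE.match(raw.strip()) else "malformed"


def has_valid_contact(fields):
    """True if at least one field of a package holds a valid address."""
    return any(classify_email(f) == "valid" for f in fields)
```

Under these assumptions, a package counts toward the "at least one valid e-mail address" figure as soon as any of its contact fields passes the check. Empty or absent fields fall into the "missing" category.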

📝 Abstract
Open Source Software (OSS) libraries form the backbone of modern software systems, yet their long-term sustainability often depends on maintainers being reachable for support, coordination, and security reporting. In this paper, we empirically analyze the availability of contact information, specifically e-mail addresses, across 686,034 Python libraries on the Python Package Index (PyPI) and their associated GitHub repositories. We examine how and where maintainers provide this information, assess its validity, and explore coverage across individual libraries and their dependency chains. Our findings show that 81.6% of libraries include at least one valid e-mail address, with PyPI serving as the primary source (79.5%). When analyzing dependency chains, we observe that up to 97.8% of direct and 97.7% of transitive dependencies provide valid contact information. At the same time, we identify over 698,000 invalid entries, primarily due to missing fields. These results demonstrate strong maintainer reachability across the ecosystem, while highlighting opportunities for improvement, such as offering clearer guidance to maintainers during the packaging process and introducing opt-in validation mechanisms for existing e-mail addresses.
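The dependency-chain coverage described above can be illustrated with a short Python sketch. It assumes a dependency graph stored as a plain dictionary, treats "transitive" as the full closure of dependencies (direct ones included), and uses hypothetical package names. The paper's own graph construction and exact definitions may differ.

```python
def transitive_deps(graph, root):
    """Return every package reachable from root via dependency edges."""
    seen = set()
    stack = list(graph.get(root, ()))
    while stack:
        pkg = stack.pop()
        if pkg in seen or pkg == root:  # skip repeats and cycles back to root
            continue
        seen.add(pkg)
        stack.extend(graph.get(pkg, ()))
    return seen


def coverage(deps, reachable):
    """Fraction of deps with valid contact info; None if there are no deps."""
    if not deps:
        return None
    return sum(1 for d in deps if d in reachable) / len(deps)


# Hypothetical toy graph: package name -> list of direct dependencies.
graph = {
    "app": ["lib_a", "lib_b"],
    "lib_a": ["lib_c"],
    "lib_b": ["lib_c"],
    "lib_c": [],
}
# Hypothetical set of packages with at least one valid e-mail address.
reachable = {"lib_a", "lib_c"}

direct = set(graph["app"])
transitive = transitive_deps(graph, "app")
print(coverage(direct, reachable), coverage(transitive, reachable))
```

In this toy graph, one of the two direct dependencies and two of the three transitive dependencies have valid contact information. Aggregating such per-package fractions across an ecosystem is one plausible way to arrive at figures like those reported.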
Problem

Research questions and friction points this paper is trying to address.

email availability
open source software
maintainer reachability
PyPI
contact information
Innovation

Methods, ideas, or system contributions that make the work stand out.

email availability
PyPI
open source sustainability
dependency chain analysis
maintainer reachability