🤖 AI Summary
Prior research on I2P has predominantly focused on its application layer (e.g., dark web services), leaving a critical gap: there is little systematic, empirical analysis of its network-layer architecture and few publicly available measurement datasets.
Method: This paper presents the first large-scale network-layer measurement of I2P and introduces SWARM-I2P, a distributed probing framework that integrates dynamic port mapping, netDb parsing, console querying, and passive traffic monitoring.
Contribution/Results: We collect and analyze data from over 50,000 I2P routers—including 2,077 FastSet nodes and 2,331 high-capacity routers—along with 4.22 million connection records and over one million packets. The dataset characterizes geolocation (3,444 nodes across 92 countries), bandwidth, latency, uptime, and traffic patterns. To our knowledge, this is the first empirically derived, publicly documented I2P network-layer dataset, enabling rigorous tunnel optimization, resilience assessment, and adversarial modeling—thereby bridging a fundamental gap in anonymous network infrastructure research.
📝 Abstract
This article presents a novel dataset focusing on the network layer of the Invisible Internet Project (I2P), where prior research has predominantly examined application layers such as the dark web. Data was collected through the SWARM-I2P framework, which deploys I2P routers as mapping agents and uses dynamic port mapping (ports 30000-50000). The dataset documents over 50,000 nodes, including 2,077 FastSet nodes and 2,331 high-capacity nodes, characterized by bandwidth, latency (mean 121.21 ± 48.50 ms), and uptime metrics. It contains 1,997 traffic records (1,003,032 packets/bytes) and 4,222,793 connection records (2,147,585,625 packets/bytes), with geographic distributions for 3,444 peers showing capacity metrics (mean 8.57 ± 1.20). Collection methods included router console queries (127.0.0.1:port/tunnels), netDb analysis, and passive monitoring, with anonymized identifiers. Data is structured in CSV/TXT formats (Zenodo), with collection scripts available on GitHub. Potential applications include tunnel peer selection analysis, anonymity network resilience studies, and adversarial modelling.
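
As a rough illustration of how the console endpoint and the reported "mean ± deviation" figures in the abstract might be worked with, the following minimal Python sketch builds the local console URL for a port in the stated 30000-50000 range and summarizes one numeric CSV column. It is not the SWARM-I2P code: the function names, the `http` scheme, the column names `peer_id` and `latency_ms`, the sample rows, and the use of the sample (rather than population) standard deviation are all assumptions made for the example.

```python
import csv
import io
import statistics

# Dynamic port range stated in the abstract.
PORT_MIN, PORT_MAX = 30000, 50000


def console_tunnels_url(port: int) -> str:
    """Build the local console URL for one mapping agent (http scheme assumed)."""
    if not PORT_MIN <= port <= PORT_MAX:
        raise ValueError(f"port {port} outside {PORT_MIN}-{PORT_MAX}")
    return f"http://127.0.0.1:{port}/tunnels"


def summarize(csv_text: str, column: str) -> tuple[float, float]:
    """Return (mean, sample standard deviation) of a numeric CSV column."""
    rows = csv.DictReader(io.StringIO(csv_text))
    values = [float(row[column]) for row in rows if row[column].strip()]
    return statistics.mean(values), statistics.stdev(values)


# Hypothetical sample; the real column names in the Zenodo files may differ.
sample = "peer_id,latency_ms\na1,100.0\nb2,120.0\nc3,140.0\n"
mean, std = summarize(sample, "latency_ms")
print(f"{console_tunnels_url(30001)} -> latency {mean:.2f} ± {std:.2f} ms")
# prints: http://127.0.0.1:30001/tunnels -> latency 120.00 ± 20.00 ms
```

To apply the same summary to the published files, the `sample` string would be replaced with the contents of a dataset CSV and the column name adjusted to match its header.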