A Public Dataset For the ZKsync Rollup

📅 2024-07-26
🏛️ arXiv.org
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
High acquisition costs and poor data quality for Layer-2 (L2) blockchain data severely hinder data-driven research in emerging ecosystems such as ZKsync. To address this, we construct and open-source the first high-quality, structured dataset comprehensively capturing one year of on-chain activity on ZKsync Era, filling a critical gap in publicly available, high-fidelity L2 chain data. Leveraging an archival node, our pipeline employs batch synchronization, transaction decoding, state snapshot extraction, and schema normalization to produce a standardized, Parquet-formatted dataset optimized for SQL querying. It comprises over 120 million transactions, tens of millions of addresses, and complete smart contract deployment records. We also release a fully reproducible data extraction workflow and analytical templates. This dataset has already enabled cutting-edge research in MEV modeling, gas optimization, and zk-SNARK verification pattern analysis.
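To give a feel for the kind of SQL analysis such a dataset is meant to support, here is a minimal sketch. It does not use the real dataset or its schema: the table name `transactions`, the columns `block_number`, `sender`, and `gas_used`, and all rows are hypothetical, and Python's built-in `sqlite3` stands in for a Parquet-aware SQL engine.

```python
import sqlite3

# Hypothetical rows (block_number, sender, gas_used), invented for illustration.
# They are NOT taken from the ZKsync dataset, and the real column names may differ.
rows = [
    (1, "0xaaa", 21000),
    (1, "0xbbb", 50000),
    (2, "0xaaa", 30000),
    (3, "0xccc", 21000),
]


def per_block_activity(rows):
    """Return (block_number, tx_count, distinct_senders, total_gas) per block."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE transactions "
        "(block_number INTEGER, sender TEXT, gas_used INTEGER)"
    )
    conn.executemany("INSERT INTO transactions VALUES (?, ?, ?)", rows)
    result = conn.execute(
        "SELECT block_number, COUNT(*), COUNT(DISTINCT sender), SUM(gas_used) "
        "FROM transactions GROUP BY block_number ORDER BY block_number"
    ).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in per_block_activity(rows):
        print(row)
```

Against the actual Parquet files, the same aggregate query would presumably be run with an engine that reads Parquet directly, after adjusting the table and column names to the published schema.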

📝 Abstract
Despite blockchain data being publicly available, practical challenges and high costs often hinder its effective use by researchers, thus limiting data-driven research and exploration in the blockchain space. This is especially true when it comes to Layer-2 (L2) ecosystems, and ZKsync, in particular. To address these issues, we have curated a dataset from 1 year of activity extracted from a ZKsync Era archive node and made it freely available to external parties. We provide details on this dataset and how it was created, showcase a few example analyses that can be performed with it, and discuss some future research directions.

Problem

Research questions and friction points this paper is trying to address.

High costs and practical hurdles hinder use of blockchain data
Data-driven research on Layer-2 ecosystems is limited
ZKsync data in particular is hard to access

Innovation

Methods, ideas, or system contributions that make the work stand out.

Public ZKsync dataset creation
One-year activity data extraction
Freely available for research