🤖 AI Summary
Existing research on decentralized lending protocols such as Aave V3 is hindered by the absence of standardized, cross-chain, event-level datasets. To address this, we construct the first Aave V3 event-level data infrastructure covering six EVM-compatible blockchains. Our pipeline systematically ingests and decodes eight core on-chain events, producing over 50 million structured records, each annotated with USD valuation, block timestamp, and chain identifier. Methodologically, we introduce cross-chain event alignment, synchronized decoding across all chains, and real-time asset-to-USD exchange-rate mapping. We also release an open-source Python pipeline supporting dynamic batch processing, automatic sharding (≤1M rows/file), and multi-chain temporal normalization. The resulting dataset ensures temporal rigor, cryptographic verifiability, and full public availability. It enables, for the first time, reproducible research on capital flow tracking, liquidation risk modeling, and cross-chain user behavior analysis, providing foundational data for empirical studies of interest rate mechanisms and systemic risk.
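Multi-chain temporal normalization, as described above, amounts to merging per-chain event streams (each already sorted by block timestamp) into one globally ordered stream. A minimal sketch using the standard library, with illustrative field names that are not necessarily the dataset's schema:

```python
import heapq

def normalize_streams(*chain_streams):
    """Merge per-chain event streams, each already sorted by block
    timestamp, into one chronologically ordered stream.

    Each event is a dict with at least 'timestamp' and 'chain' keys;
    these field names are illustrative, not the paper's schema.
    """
    yield from heapq.merge(*chain_streams, key=lambda e: e["timestamp"])

# Toy usage: two already-sorted per-chain streams.
eth = [{"chain": "ethereum", "timestamp": 100},
       {"chain": "ethereum", "timestamp": 300}]
arb = [{"chain": "arbitrum", "timestamp": 200}]
merged = list(normalize_streams(eth, arb))
# merged is ordered 100 (ethereum), 200 (arbitrum), 300 (ethereum)
```

`heapq.merge` is lazy and only assumes each input is individually sorted, which fits a pipeline that decodes each chain independently and interleaves the results afterward.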
📝 Abstract
Decentralized lending protocols, exemplified by Aave V3, have transformed financial intermediation by enabling permissionless, multi-chain borrowing and lending without intermediaries. Although the protocol manages over $10 billion in total value locked, empirical research remains severely constrained by the lack of standardized, cross-chain, event-level datasets.
This paper introduces the first comprehensive, event-driven data infrastructure for Aave V3 spanning six major EVM-compatible chains (Ethereum, Arbitrum, Optimism, Polygon, Avalanche, and Base) from their respective deployment blocks through October 2025. We collect and fully decode eight core event types -- Supply, Borrow, Withdraw, Repay, LiquidationCall, FlashLoan, ReserveDataUpdated, and MintedToTreasury -- producing over 50 million structured records enriched with block metadata and USD valuations.
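Enriching a decoded event with a USD valuation typically means looking up the most recent price observation at or before the event's timestamp. A sketch of that join under assumed data structures (the `price_series` layout and function name are illustrative, not the paper's):

```python
import bisect

def usd_value(amount, asset, ts, price_series):
    """Value `amount` of `asset` at timestamp `ts` using the most
    recent price observation at or before `ts`.

    `price_series[asset]` is a list of (timestamp, usd_price) tuples
    sorted by timestamp -- an assumed structure for illustration.
    """
    series = price_series[asset]
    times = [t for t, _ in series]
    # Index of the last observation with time <= ts.
    i = bisect.bisect_right(times, ts) - 1
    if i < 0:
        raise ValueError("no price observed at or before ts")
    return amount * series[i][1]

# Toy usage: two price observations for WETH.
prices = {"WETH": [(0, 1800.0), (600, 1850.0)]}
print(usd_value(2.0, "WETH", 700, prices))  # → 3700.0, via the 1850.0 price
```

Using the nearest prior observation (rather than the nearest overall) avoids look-ahead: an event is never valued with a price that did not yet exist at its block timestamp.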
Using an open-source Python pipeline with dynamic batch sizing and automatic sharding (at most 1 million rows per file), we ensure strict chronological ordering and full reproducibility. The resulting publicly available dataset enables granular analysis of capital flows, interest rate dynamics, liquidation cascades, and cross-chain user behavior, providing a foundational resource for future studies on decentralized lending markets and systemic risk.
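The sharding rule above (start a new output file once the current one reaches the row cap) can be sketched as a small stdlib writer; the function name, CSV format, and file-naming pattern are assumptions for illustration, not the released pipeline's actual layout:

```python
import csv
import os

def write_sharded(records, fieldnames, out_dir, prefix, max_rows=1_000_000):
    """Write dict records to CSV shards, rolling over to a new file
    whenever the current shard reaches `max_rows` rows (header excluded).

    Naming and format are illustrative; the released dataset may differ.
    """
    os.makedirs(out_dir, exist_ok=True)
    shard, rows, writer, fh, paths = 0, 0, None, None, []
    for rec in records:
        if writer is None or rows >= max_rows:
            if fh:
                fh.close()
            path = os.path.join(out_dir, f"{prefix}_{shard:04d}.csv")
            fh = open(path, "w", newline="")
            writer = csv.DictWriter(fh, fieldnames=fieldnames)
            writer.writeheader()
            paths.append(path)
            shard += 1
            rows = 0
        writer.writerow(rec)
        rows += 1
    if fh:
        fh.close()
    return paths
```

Because records arrive in strict chronological order, capping shard size this way preserves global ordering across files: shard `0000` ends before shard `0001` begins.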