FAFO: Over 1 million TPS on a single node running EVM while still Merkleizing every block

📅 2025-07-14
🤖 AI Summary
Blockchain execution layers are bottlenecked by data contention among transactions, which limits parallelism and throughput. To address this, we propose FAFO, a scheduler that reorders transactions before block formation to maximize concurrency while still Merkleizing the world state after every block. FAFO uses a cache-friendly, CPU-optimized Bloom filter for lightweight conflict detection and integrates REVM (the Rust EVM) with QMDB to Merkleize the world state after each block, which supports light clients and ZK-based stateless validation. On a single node, FAFO exceeds 1.1 million TPS for native ETH transfers and 500,000 TPS for ERC-20 transfers. Compared with state-of-the-art sharded execution, it costs 91% less, and its throughput scales linearly with added CPU resources until the parallelism available in the transaction flow is exhausted.

📝 Abstract
Current blockchain execution throughput is limited by data contention, which reduces execution-layer parallelism. Fast Ahead-of-Formation Optimization (FAFO) is the first blockchain transaction scheduler to address this problem by reordering transactions before block formation for maximum concurrency. FAFO uses CPU-optimized, cache-friendly Bloom filters to efficiently detect conflicts and schedule parallel transaction execution at high throughput and low overhead. We integrate the Rust EVM client (REVM) into FAFO and achieve over 1.1 million native ETH transfers per second and over half a million ERC20 transfers per second on a single node (Table 1), with 91% lower cost compared to state-of-the-art sharded execution. Unlike many existing high-throughput blockchain execution clients, FAFO uses QMDB to Merkleize world state after every block, enabling light clients and stateless validation for ZK-based vApps. FAFO has minimal synchronization overhead and scales linearly with additional CPU resources until it fully exploits the maximum parallelism of the underlying transaction flow. FAFO shows that the high throughput necessary to support future decentralized applications can be achieved with a streamlined execution layer and innovations in blockchain transaction scheduler design. FAFO is open-sourced at https://github.com/LayerZero-Labs/fafo.
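To make the Bloom-filter conflict check concrete, here is a minimal Python sketch. It is not FAFO's implementation: the class `BloomFilter`, the helpers `conflicts_with_batch` and `admit`, the 512-bit filter size, and the use of BLAKE2b are all assumptions chosen for illustration. The idea it shows is that a batch of transactions can be summarized by two small filters (keys read, keys written), and a candidate transaction is tested against them conservatively: the test may report a conflict that does not exist, but it never misses a real one.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter stored in a single Python int (illustrative only)."""

    def __init__(self, num_bits=512, num_hashes=2):
        # 512 bits is a hypothetical size (one 64-byte cache line).
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0

    def _positions(self, key):
        # Derive num_hashes bit positions from one BLAKE2b digest.
        digest = hashlib.blake2b(
            key.encode(), digest_size=4 * self.num_hashes
        ).digest()
        return [
            int.from_bytes(digest[4 * i:4 * i + 4], "little") % self.num_bits
            for i in range(self.num_hashes)
        ]

    def add(self, key):
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def might_contain(self, key):
        # False means "definitely absent"; True means "possibly present".
        return all((self.bits >> pos) & 1 for pos in self._positions(key))


def conflicts_with_batch(reads, writes, batch_reads, batch_writes):
    """True if the transaction may conflict with anything already in the batch.

    A write conflicts with an earlier read or write of the same key;
    a read conflicts with an earlier write of the same key.
    """
    if any(
        batch_reads.might_contain(k) or batch_writes.might_contain(k)
        for k in writes
    ):
        return True
    return any(batch_writes.might_contain(k) for k in reads)


def admit(reads, writes, batch_reads, batch_writes):
    """Record a transaction's read and write sets in the batch filters."""
    for k in reads:
        batch_reads.add(k)
    for k in writes:
        batch_writes.add(k)
```

A scheduler built on this would call `conflicts_with_batch` for each candidate transaction and `admit` it only when the check returns `False`. False positives cost some parallelism but never correctness, which is why a small, fixed-size filter is an acceptable trade.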

Problem

Research questions and friction points this paper is trying to address.

Data contention among transactions limits EVM execution throughput
Transaction scheduling must expose as much parallel execution as possible
Blocks must still be Merkleized to support light clients and stateless validation
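The contention problem listed above can be illustrated with a toy comparison. The sketch below is an assumption-laden simplification, not the paper's algorithm: it uses exact key sets instead of Bloom filters, treats every touched account as a write, and the function names `in_order_batches` and `reordered_batches` are hypothetical. It groups transactions into batches whose members touch disjoint accounts, first keeping arrival order and then allowing each transaction to move to the earliest batch it fits in.

```python
def in_order_batches(txs):
    """Cut a new batch whenever the next transaction conflicts with the
    current one, preserving arrival order."""
    batches, current, touched = [], [], set()
    for name, keys in txs:
        if touched & keys:
            batches.append(current)
            current, touched = [], set()
        current.append(name)
        touched.update(keys)
    if current:
        batches.append(current)
    return batches


def reordered_batches(txs):
    """Greedy first-fit: place each transaction in the earliest batch
    that touches none of its keys."""
    batches = []  # list of (transaction names, touched keys)
    for name, keys in txs:
        for names, touched in batches:
            if not (touched & keys):
                names.append(name)
                touched.update(keys)
                break
        else:
            batches.append(([name], set(keys)))
    return [names for names, _ in batches]


# Four hypothetical transfers; each touches its sender and receiver.
txs = [
    ("t1", {"alice", "bob"}),
    ("t2", {"bob", "carol"}),
    ("t3", {"dave", "erin"}),
    ("t4", {"erin", "frank"}),
]

print(in_order_batches(txs))   # [['t1'], ['t2', 't3'], ['t4']]
print(reordered_batches(txs))  # [['t1', 't3'], ['t2', 't4']]
```

With arrival order preserved, the four transfers need three sequential batches; with reordering they fit in two, so more transactions run side by side in each step.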

Innovation

Methods, ideas, or system contributions that make the work stand out.

Reorders transactions before block formation for maximum concurrency
Uses CPU-optimized Bloom filters for conflict detection
Merkleizes world state after every block
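As a rough picture of what Merkleizing the world state after every block means, the sketch below recomputes a state root each time a block of transfers is applied. It is only an illustration under stated assumptions: it uses a plain binary Merkle tree with SHA-256 over sorted key/value pairs, whereas QMDB has its own data structure, and the names `merkle_root` and `apply_block` are hypothetical.

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(state: dict) -> bytes:
    """Root of a binary Merkle tree over the sorted (key, value) pairs."""
    # Prefix bytes separate leaf hashes (0x00) from inner-node hashes (0x01).
    level = [
        _h(b"\x00" + key.encode() + b"=" + str(value).encode())
        for key, value in sorted(state.items())
    ]
    if not level:
        return _h(b"")
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # duplicate the last node on odd levels
        level = [
            _h(b"\x01" + level[i] + level[i + 1])
            for i in range(0, len(level), 2)
        ]
    return level[0]


def apply_block(state: dict, transfers) -> bytes:
    """Apply (sender, receiver, amount) transfers, then return the new root."""
    for sender, receiver, amount in transfers:
        state[sender] -= amount
        state[receiver] = state.get(receiver, 0) + amount
    return merkle_root(state)


state = {"alice": 100, "bob": 50}
root_before = merkle_root(state)
root_after = apply_block(state, [("alice", "bob", 10), ("bob", "carol", 5)])
print(root_before.hex() != root_after.hex())  # True
```

A light client that trusts only the per-block root can verify an individual balance with a logarithmic-size proof, which is the property that per-block Merkleization preserves. This sketch rehashes the whole state for clarity; a production system would update only the affected paths.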

Authors
Ryan Zarick
LayerZero Labs Ltd
Isaac Zhang
LayerZero Labs Ltd
Daniel Wong
Associate Professor, University of California, Riverside
Computer Architecture, Energy Efficiency, High Performance Computing
Thomas Kim
LayerZero Labs Ltd
Bryan Pellegrino
LayerZero Labs Ltd
Mignon Li
LayerZero Labs Ltd
Kelvin Wong
LayerZero Labs Ltd