The Bidding Games: Reinforcement Learning for MEV Extraction on Polygon Blockchain

📅 2025-10-16
🤖 AI Summary
Polygon Atlas’s sealed-bid MEV auction presents millisecond-scale, partially observable, and competitively uncertain bidding challenges. Method: We introduce the first high-fidelity simulation environment aligned with the real Atlas mechanism and design a history-aware, continuous-action agent based on Proximal Policy Optimization (PPO), integrating low-latency real-time inference with RL-driven optimization to overcome limitations of classical game-theoretic approaches in high-frequency dynamic settings. Contribution/Results: Experiments show our agent captures 49% of extractable value when coexisting with incumbent searchers, and achieves 81% when replacing the market leader—substantially outperforming static strategies. This work is the first to jointly model stochastic arbitrage opportunities and latent competition in Atlas auctions, delivering a scalable, production-ready RL framework for adaptive MEV extraction in structured blockchain auctions.

📝 Abstract
In blockchain networks, the strategic ordering of transactions within blocks has emerged as a significant source of profit extraction, known as Maximal Extractable Value (MEV). The transition from spam-based Priority Gas Auctions to structured auction mechanisms like Polygon Atlas has transformed MEV extraction from public bidding wars into sealed-bid competitions under extreme time constraints. While this shift reduces network congestion, it introduces complex strategic challenges where searchers must make optimal bidding decisions within a sub-second window without knowledge of competitor behavior or presence. Traditional game-theoretic approaches struggle in this high-frequency, partially observable environment due to their reliance on complete information and static equilibrium assumptions. We present a reinforcement learning framework for MEV extraction on Polygon Atlas and make three contributions: (1) A novel simulation environment that accurately models the stochastic arrival of arbitrage opportunities and probabilistic competition in Atlas auctions; (2) A PPO-based bidding agent optimized for real-time constraints, capable of adaptive strategy formulation in continuous action spaces while maintaining production-ready inference speeds; (3) Empirical validation demonstrating our history-conditioned agent captures 49% of available profits when deployed alongside existing searchers and 81% when replacing the market leader, significantly outperforming static bidding strategies. Our work establishes that reinforcement learning provides a critical advantage in high-frequency MEV environments where traditional optimization methods fail, offering immediate value for industrial participants and protocol designers alike.
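To make the setting in the abstract concrete, here is a minimal sketch of a sealed-bid auction with stochastic opportunity values and a rival who is only sometimes present. It is not the paper's simulator: the class name `ToySealedBidEnv`, the exponential value distribution, the uniform rival bid, the first-price payment rule, and every parameter value are assumptions chosen only for illustration.

```python
import random
from dataclasses import dataclass


@dataclass
class AuctionResult:
    value: float   # gross value of the arbitrage opportunity
    bid: float     # amount our agent bid
    won: bool      # whether our bid won the auction
    profit: float  # value - bid if we won, otherwise 0


class ToySealedBidEnv:
    """Hypothetical toy model of one sealed-bid auction per step."""

    def __init__(self, p_competitor=0.7, mean_value=1.0, seed=0):
        self.rng = random.Random(seed)
        self.p_competitor = p_competitor  # assumed chance that a rival shows up
        self.mean_value = mean_value      # assumed mean opportunity value
        self.value = 0.0

    def reset(self):
        # Stochastic opportunity: value drawn from an exponential distribution.
        self.value = self.rng.expovariate(1.0 / self.mean_value)
        return self.value

    def step(self, bid_fraction):
        # Continuous action: the fraction of the opportunity value to bid.
        bid_fraction = min(max(bid_fraction, 0.0), 1.0)
        bid = bid_fraction * self.value

        # Probabilistic competition: the agent never observes this directly.
        rival_present = self.rng.random() < self.p_competitor
        rival_bid = self.rng.random() * self.value if rival_present else 0.0

        won = (not rival_present) or bid > rival_bid
        profit = self.value - bid if won else 0.0
        return AuctionResult(self.value, bid, won, profit)


# Usage: average profit of a static strategy that always bids half the value.
env = ToySealedBidEnv(seed=42)
total = 0.0
for _ in range(1000):
    env.reset()
    total += env.step(0.5).profit
average_profit = total / 1000
```

A static strategy like the one in the usage loop is the kind of baseline the abstract compares against; a learned policy would instead choose `bid_fraction` from its observations.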
Problem

Research questions and friction points this paper is trying to address.

Optimizing bidding strategies in sealed-bid MEV auctions
Addressing partial observability in high-frequency blockchain environments
Developing adaptive agents for real-time Polygon Atlas competitions
Innovation

Methods, ideas, or system contributions that make the work stand out.

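Simulation environment models the stochastic arrival of arbitrage opportunities and probabilistic competition in Atlas auctions
PPO-based agent adapts its bidding strategy in a continuous action space under real-time constraints
History-conditioned agent captures 49% of available profit alongside existing searchers and 81% when replacing the market leader, outperforming static bidding strategies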
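One way to read "history-conditioned" is that the policy's input includes a fixed-length window of recent auction outcomes in addition to the current opportunity. The sketch below builds such an observation vector. The page does not say which features the authors use, so the window size, the (bid fraction, won) pair, the zero padding, and the name `HistoryObservation` are all assumptions.

```python
from collections import deque


class HistoryObservation:
    """Hypothetical observation builder: current value plus recent outcomes."""

    def __init__(self, window=4):
        self.window = window
        # Keeps only the most recent `window` outcomes; older ones drop off.
        self.history = deque(maxlen=window)

    def record(self, bid_fraction, won):
        # Store each outcome as a (bid fraction, win flag) pair of floats.
        self.history.append((float(bid_fraction), 1.0 if won else 0.0))

    def build(self, current_value):
        # Left-pad with zeros so the vector length stays constant early on.
        padding = [(0.0, 0.0)] * (self.window - len(self.history))
        flat = [x for pair in padding + list(self.history) for x in pair]
        return [float(current_value)] + flat


# Usage: the vector has length 1 + 2 * window regardless of history length.
obs = HistoryObservation(window=2)
obs.record(0.5, True)
vector = obs.build(2.0)
```

A fixed-length vector like this can be fed to a standard feed-forward policy network; a recurrent policy would be an alternative way to carry history.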
Andrei Seoev
MEV-X, Moscow, Russia
Leonid Gremyachikh
Independent Researcher, Moscow, Russia
Anastasiia Smirnova
Moscow Institute of Physics and Technology, Moscow, Russia
Yash Madhwal
Skolkovo Institute of Science and Technology, Moscow, Russia
Alisa Kalacheva
Moscow Institute of Physics and Technology, Moscow, Russia
Dmitry Belousov
Moscow Institute of Physics and Technology, Moscow, Russia
Ilia Zubov
Moscow Institute of Physics and Technology, Moscow, Russia
Aleksei Smirnov
MEV-X, Moscow, Russia
Denis Fedyanin
HSE University, Moscow, Russia
Vladimir Gorgadze
Moscow Institute of Physics and Technology, Moscow, Russia
Yury Yanovich
Skolkovo Institute of Science and Technology
Blockchain, Statistics, Machine learning