DRIVE: Distributional and Retrieval-Augmented Bidding with Value Evaluation

📅 2026-06-12

🤖 AI Summary

This work addresses two limitations of unimodal, purely parametric policies in offline auto-bidding: policy averaging, where several effective bidding strategies collapse into one suboptimal action, and unreliable performance under sparse or long-tail traffic. It introduces DRIVE, a Transformer-based framework that decouples candidate action generation from decision making. DRIVE combines distributional action modeling, retrieval of high-quality historical decisions as additional candidates, and value estimation to select the most promising bid at inference time. In experiments on AuctionNet and additional offline reinforcement learning benchmarks, DRIVE consistently improves bidding performance and generalizes across multiple Transformer-based methods.
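To make the policy-averaging problem concrete, here is a minimal toy sketch. The logged bids and the `toy_value` function are invented for illustration and do not come from the paper; the only point is that a mean-regressing unimodal policy can land between two good strategies.

```python
import statistics

# Hypothetical logged bids for one state: a conservative strategy near 1.0
# and an aggressive strategy near 3.0 (illustrative numbers only).
logged_bids = [0.9, 1.0, 1.1, 2.9, 3.0, 3.1]


def toy_value(bid: float) -> float:
    """Toy value with peaks at 1.0 and 3.0 (an assumption, not from the paper)."""
    return max(1.0 - abs(bid - 1.0), 1.0 - abs(bid - 3.0), 0.0)


# A unimodal policy fitted with squared error regresses toward the mean bid,
# which here falls in the low-value gap between the two strategies.
mean_bid = statistics.fmean(logged_bids)

# Choosing among the logged modes by value keeps a high-value action instead.
best_logged_bid = max(logged_bids, key=toy_value)

print(f"mean bid {mean_bid:.2f} has toy value {toy_value(mean_bid):.2f}")
print(f"best logged bid {best_logged_bid:.2f} has toy value {toy_value(best_logged_bid):.2f}")
```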

📝 Abstract

Auto-bidding is a core component of real-time advertising systems, where decisions must optimize long-term performance under budget and cost constraints, while online exploration is prohibitively risky. Offline reinforcement learning and, more recently, Transformer-based sequence modeling have shown promise for learning bidding policies from logged data, but their unimodal and purely parametric formulations often collapse multiple effective bidding strategies into suboptimal averaged actions and perform unreliably under sparse or long-tail traffic. To mitigate these limitations, we propose DRIVE (Distributional and Retrieval-Augmented Bidding with Value Evaluation), a unified Transformer-based framework that decouples candidate action generation from decision making for offline auto-bidding. DRIVE combines distributional action modeling, retrieval-augmented candidate generation from high-quality historical decisions, and value-based evaluation to select the most promising bid at inference time. Extensive experiments on AuctionNet and additional offline reinforcement learning benchmarks demonstrate that DRIVE consistently improves bidding performance and generalizes well across multiple Transformer-based methods.
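The generate-then-evaluate idea described in the abstract can be sketched roughly as follows. This is not the authors' implementation: the Gaussian mixture standing in for the distributional action model, the nearest-neighbor lookup standing in for retrieval, and every name here (`sample_distributional_candidates`, `retrieve_candidates`, `select_bid`, `toy_value_fn`) are hypothetical choices made only to show how candidate generation can be separated from the final decision.

```python
import math
import random


def sample_distributional_candidates(modes, weights, std, n, rng):
    """Sample n bids from a Gaussian mixture (stand-in for a distributional action model)."""
    centers = rng.choices(modes, weights=weights, k=n)
    return [max(0.0, rng.gauss(center, std)) for center in centers]


def retrieve_candidates(state, memory, k):
    """Return the bids of the k stored states closest to `state` (Euclidean distance)."""
    ranked = sorted(memory, key=lambda item: math.dist(state, item[0]))
    return [bid for _, bid in ranked[:k]]


def select_bid(state, candidates, value_fn):
    """Pick the candidate bid with the highest estimated value."""
    return max(candidates, key=lambda bid: value_fn(state, bid))


def toy_value_fn(state, bid):
    """Toy value estimate that prefers bids near 3.0 (an assumption for illustration)."""
    return -(bid - 3.0) ** 2


rng = random.Random(0)
state = (0.5, 0.2)
# Hypothetical memory of (state, bid) pairs taken from high-quality logged decisions.
memory = [((0.5, 0.25), 2.8), ((0.9, 0.9), 0.4), ((0.4, 0.2), 3.1)]

# Candidate generation: parametric samples plus retrieved historical bids.
candidates = sample_distributional_candidates([1.0, 3.0], [0.5, 0.5], 0.1, 8, rng)
candidates += retrieve_candidates(state, memory, k=2)

# Decision making: a separate value estimate scores every candidate.
chosen = select_bid(state, candidates, toy_value_fn)
print(f"chose bid {chosen:.3f} from {len(candidates)} candidates")
```

Keeping the value function outside the generators means either candidate source can be swapped without touching the selection step, which is one way to read the decoupling the abstract describes.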

Problem

Research questions and friction points this paper is trying to address.

auto-bidding

offline reinforcement learning

Transformer-based sequence modeling

long-tail traffic

budget constraints

Innovation

Methods, ideas, or system contributions that make the work stand out.

distributional bidding

retrieval-augmented candidate generation

offline reinforcement learning