Future-as-Label: Scalable Supervision from Real-World Outcomes

📅 2026-01-09
🏛️ arXiv.org
📈 Citations: 1
✨ Influential: 0
🤖 AI Summary
Real-world forecasting suffers from label delay: ground-truth outcomes become available only after events resolve, so no labels exist at prediction time. This work proposes a “future-as-label” paradigm that casts forecasting as reinforcement learning with verifiable rewards. Language models make probabilistic predictions from causally masked information, and once events resolve, the realized outcomes serve as the supervision signal, scored with strictly proper scoring rules such as the Brier score. The approach requires no manual annotation and learns end-to-end from delayed rewards. In experiments, Qwen3-32B trained with this method (Foresight Learning) improves Brier score by 27% and halves calibration error relative to its pretrained baseline on real-world forecasting benchmarks. It also outperforms the much larger Qwen3-235B on constructed future-event prediction tasks and on the Metaculus benchmark, despite having roughly one-seventh of its parameters.
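
To make the reward concrete, here is a minimal sketch of a Brier-based reward for a single binary question. It assumes the reward is simply the negative squared error between the forecast probability and the resolved outcome; the paper's exact reward shaping is not stated on this page, and the function names `brier_reward` and `expected_reward` are illustrative only.

```python
def brier_reward(prob: float, outcome: int) -> float:
    """Negative Brier score for one binary forecast (higher is better).

    Assumption: reward = -(p - y)^2; the paper may scale or shift this.
    """
    if not 0.0 <= prob <= 1.0:
        raise ValueError("prob must lie in [0, 1]")
    if outcome not in (0, 1):
        raise ValueError("outcome must be 0 or 1")
    return -((prob - outcome) ** 2)


def expected_reward(prob: float, true_rate: float) -> float:
    """Expected reward of reporting `prob` when the event occurs with `true_rate`."""
    return (true_rate * brier_reward(prob, 1)
            + (1.0 - true_rate) * brier_reward(prob, 0))


# Properness check: over a grid of possible reports, the expected reward
# is highest when the reported probability equals the true event rate.
grid = [i / 100 for i in range(101)]
best_report = max(grid, key=lambda p: expected_reward(p, true_rate=0.7))
print(best_report)
```

Because the expected reward peaks at the true event rate, a model optimizing it has no incentive to overstate or understate its confidence, which is the property that makes a scoring rule "proper".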

📝 Abstract
Time creates free supervision: forecasts about real-world events resolve to verifiable outcomes. The passage of time provides labels that require no annotation. To exploit this structure, we extend reinforcement learning with verifiable rewards to real-world prediction over time. We train language models to make probabilistic forecasts from causally masked information, using proper scoring rules as the reward function once events resolve. Learning is driven entirely by realized outcomes, enabling scalable outcome-based supervision in open-world prediction. On real-world forecasting benchmarks, Qwen3-32B trained using Foresight Learning improves Brier score by 27% and halves calibration error relative to its pretrained baseline, and outperforms Qwen3-235B on both constructed future-event prediction tasks and the Metaculus benchmark despite a 7x parameter disadvantage.
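
The calibration claim can be illustrated with a short sketch. It assumes the common binned expected calibration error (ECE) with equal-width bins; the abstract does not say which calibration metric the authors use, and `expected_calibration_error` is a hypothetical helper name.

```python
from typing import Sequence


def expected_calibration_error(
    probs: Sequence[float], outcomes: Sequence[int], n_bins: int = 10
) -> float:
    """Binned ECE: weighted mean gap between mean forecast and observed frequency.

    Assumption: equal-width bins on [0, 1]; not necessarily the paper's metric.
    """
    if len(probs) != len(outcomes) or len(probs) == 0:
        raise ValueError("probs and outcomes must be non-empty and equal length")
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, outcomes):
        # Clamp the index so that p == 1.0 falls in the last bin.
        idx = min(int(p * n_bins), n_bins - 1)
        bins[idx].append((p, y))
    total = len(probs)
    ece = 0.0
    for bucket in bins:
        if not bucket:
            continue
        mean_p = sum(p for p, _ in bucket) / len(bucket)
        freq = sum(y for _, y in bucket) / len(bucket)
        ece += (len(bucket) / total) * abs(mean_p - freq)
    return ece


# Ten questions, six of which resolved "yes".
resolved = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]
overconfident = expected_calibration_error([0.9] * 10, resolved)
calibrated = expected_calibration_error([0.6] * 10, resolved)
print(overconfident, calibrated)
```

In this toy data, forecasting 0.9 on events that occur 60% of the time leaves a large gap, while forecasting 0.6 leaves almost none; "halving calibration error" means shrinking that gap by about half on the evaluation set.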
Problem

Research questions and friction points this paper is trying to address.

temporal gap
delayed supervision
real-world forecasting
outcome-based labeling
prediction without immediate labels
Innovation

Methods, ideas, or system contributions that make the work stand out.

Future-as-Label
delayed supervision
probabilistic forecasting
causally masked information
proper scoring rules
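
As a rough illustration of "causally masked information", the sketch below keeps only context published strictly before the forecast time, so nothing from after the prediction date can leak into the model's input. The `Document` type, the `mask_context` helper, and the example dates are all hypothetical; the page does not describe how the authors implement masking.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Document:
    """Hypothetical context item with a publication timestamp."""
    published: datetime
    text: str


def mask_context(docs: list[Document], forecast_time: datetime) -> list[Document]:
    """Keep only documents published strictly before the forecast time."""
    return [d for d in docs if d.published < forecast_time]


# Illustrative data: one item predates the forecast, one postdates it.
docs = [
    Document(datetime(2025, 1, 5), "Early report"),
    Document(datetime(2025, 3, 1), "Post-resolution recap"),
]
visible = mask_context(docs, forecast_time=datetime(2025, 2, 1))
print([d.text for d in visible])
```

The outcome itself is then used only after the event resolves, as the label for scoring the forecast made from the masked context.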
Benjamin Turtel
CEO @ Lightning Rod Labs
AI, ML, Forecasting
Paul Wilczewski
Lightning Rod Labs
Danny Franklin
Lightning Rod Labs
Kris Skothiem
Lightning Rod Labs