🤖 AI Summary
This study addresses the polarized landscape of current AI training data markets, in which proposals split between “free use” and strong intellectual property regimes, and both leave high-quality original content undersupplied. Using a static Stackelberg game and a dynamic feedback model, the paper identifies two key market failure mechanisms: the “originality penalty” and the “curse of precision.” Building on these insights, it proposes a dynamic market design featuring a data intermediary that internalizes cross-creator externalities and subsidizes innovative contributions, thereby balancing technological advancement with creative incentives. The theoretical analysis indicates that this mechanism restores incentive efficiency, mitigates content homogenization, and improves the long-term performance of AI models.
📝 Abstract
How can we design a market for human-generated content used to train AI models that both enables technological progress and preserves individual incentives for high-quality content creation? Existing approaches take polar positions: a "free-for-all" model based on fair use and a "strong intellectual property rights" model. We show that both fail: the free-for-all model does not compensate creators, and, when modeled as a static Stackelberg game, strong intellectual property rights also provide too weak an incentive to create. This is especially true for more innovative creators, a phenomenon we term the "originality penalty." Extending this insight to a dynamic model, we find another market failure that undermines AI model performance, even for an initially good model: such a model induces greater human reliance on AI-assisted creation, so homogenized content feeds back into training and degrades model performance, a "curse of precision." We further propose a market design in which a data intermediary internalizes cross-creator externalities and subsidizes innovative contributions, thereby restoring efficiency.