Economics of Sourcing Human Data

📅 2025-02-11

🤖 AI Summary

Problem: The large-scale deployment of AI is degrading the quality of data on platforms that rely on human contributions. Existing data collection systems lean too heavily on extrinsic incentives and neglect intrinsic motivation, which leads to declining user engagement and distorted annotations. Method: The paper brings behavioral economics into human-AI collaborative data production and proposes a data acquisition paradigm centered on three intrinsic motivators: autonomy, competence, and relatedness. It combines motivational theory, platform mechanism design, empirical analysis of user behavior, and multidimensional data quality evaluation. Contribution/Results: The paper reports that intrinsic motivation significantly improves annotation accuracy and long-term user retention. It outlines a sustainable, trustworthy governance framework for human-generated data ecosystems, with design principles and implementation pathways for high-quality AI data infrastructure.

📝 Abstract

Progress in AI has relied on human-generated data, from annotator marketplaces to the wider Internet. However, the widespread use of large language models now threatens the quality and integrity of human-generated data on these very platforms. We argue that this issue goes beyond the immediate challenge of filtering AI-generated content--it reveals deeper flaws in how data collection systems are designed. Existing systems often prioritize speed, scale, and efficiency at the cost of intrinsic human motivation, leading to declining engagement and data quality. We propose that rethinking data collection systems to align with contributors' intrinsic motivations--rather than relying solely on external incentives--can help sustain high-quality data sourcing at scale while maintaining contributor trust and long-term participation.

Problem

Research questions and friction points this paper is trying to address.

AI-generated content threatens data quality

Current systems neglect human intrinsic motivation

Rethinking data collection enhances long-term engagement

Innovation

Methods, ideas, or system contributions that make the work stand out.

AI data quality improvement

Intrinsic motivation alignment

Scalable data collection redesign
