🤖 AI Summary
This study addresses how to integrate auxiliary data generated by large language models into decision-making driven by human labels, while avoiding the bias and efficiency loss that arise when the relationship between AI outputs and true labels is misspecified. The authors propose Generative Augmented Inference (GAI), a framework that treats high-dimensional AI outputs as informative features rather than as direct proxies for labels, and uses an orthogonal moment construction to obtain consistent estimation and valid inference without assuming a functional form for the link between AI outputs and labels. GAI has a “safe default” property: relative to an estimator that uses only human annotations, it weakly improves estimation efficiency under arbitrary auxiliary signals and yields strict gains whenever the auxiliary information is predictive. Empirically, in conjoint analysis with weak auxiliary signals, GAI reduces estimation error by about 50% and lowers human labeling requirements by over 75%; in retail pricing, it consistently outperforms alternative estimators that access the same auxiliary inputs; and in health insurance choice, it cuts labeling requirements by over 90% while maintaining decision accuracy. Across applications, it improves confidence interval coverage without inflating interval width.
📝 Abstract
Data-driven operations management often relies on parameters estimated from costly human-generated labels. Recent advances in large language models (LLMs) and other AI systems offer inexpensive auxiliary data, but introduce a new challenge: AI outputs are not direct observations of the target outcomes, and may instead be high-dimensional representations with complex and unknown relationships to human labels. Conventional methods use AI predictions as direct proxies for true labels, which can be inefficient or unreliable when this relationship is weak or misspecified. We propose Generative Augmented Inference (GAI), a general framework that incorporates AI-generated outputs as informative features for estimating models of human-labeled outcomes. GAI uses an orthogonal moment construction that enables consistent estimation and valid inference under a flexible, nonparametric relationship between LLM-generated outputs and human labels. We establish asymptotic normality and show a "safe default" property: relative to human-data-only estimators, GAI weakly improves estimation efficiency under arbitrary auxiliary signals and yields strict gains whenever the auxiliary information is predictive. Empirically, GAI outperforms benchmarks across diverse settings. In conjoint analysis with weak auxiliary signals, GAI reduces estimation error by about 50% and lowers human labeling requirements by over 75%. In retail pricing, where all methods access the same auxiliary inputs, GAI consistently outperforms alternative estimators, highlighting the value of its construction rather than differences in information. In health insurance choice, GAI cuts labeling requirements by over 90% while maintaining decision accuracy. Across applications, GAI improves confidence interval coverage without inflating width. Overall, GAI provides a principled and scalable approach to integrating AI-generated information.
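To make the idea of using AI outputs as features rather than as label proxies more concrete, here is a minimal Python sketch of a generic cross-fitted augmented estimator for a simple mean. It is not the authors' GAI estimator, whose exact construction is not given in the abstract. The data-generating process, the polynomial basis, and the names `basis`, `augmented_mean`, and `simulate` are all assumptions made for illustration. A small human-labeled sample is used both to fit a predictor on the auxiliary features and to correct its bias, while a large unlabeled pool supplies the plug-in term.

```python
import numpy as np


def basis(w):
    """Intercept, linear, and squared terms of the auxiliary features (illustrative choice)."""
    return np.column_stack([np.ones(len(w)), w, w ** 2])


def augmented_mean(y_lab, w_lab, w_unlab, n_folds=2):
    """Cross-fitted augmented estimate of E[Y] (a generic sketch, not the paper's method)."""
    folds = np.arange(len(y_lab)) % n_folds
    estimates = []
    for k in range(n_folds):
        train, held_out = folds != k, folds == k
        # Fit the predictor g on the labeled rows outside fold k.
        coef, *_ = np.linalg.lstsq(basis(w_lab[train]), y_lab[train], rcond=None)
        # Plug-in term: average prediction over the large unlabeled pool.
        plug_in = (basis(w_unlab) @ coef).mean()
        # Bias correction: average residual on the held-out human labels.
        correction = (y_lab[held_out] - basis(w_lab[held_out]) @ coef).mean()
        estimates.append(plug_in + correction)
    return float(np.mean(estimates))


def simulate(n_reps=200, n_lab=200, n_unlab=20_000, seed=0):
    """Compare mean squared error of the human-only and augmented estimators on synthetic data."""
    rng = np.random.default_rng(seed)
    true_mean = 2.0  # E[1 + 2*W0 + W1^2 + noise] when W is standard normal
    err_human, err_aug = [], []
    for _ in range(n_reps):
        w_lab = rng.normal(size=(n_lab, 3))
        w_unlab = rng.normal(size=(n_unlab, 3))
        # Assumed outcome model: nonlinear in the auxiliary features, plus noise.
        y_lab = 1 + 2 * w_lab[:, 0] + w_lab[:, 1] ** 2 + rng.normal(size=n_lab)
        err_human.append(y_lab.mean() - true_mean)
        err_aug.append(augmented_mean(y_lab, w_lab, w_unlab) - true_mean)
    return float(np.mean(np.square(err_human))), float(np.mean(np.square(err_aug)))


mse_human, mse_aug = simulate()
print(f"human-only MSE: {mse_human:.4f}, augmented MSE: {mse_aug:.4f}")
```

In this synthetic setup the auxiliary features are strongly predictive, so the augmented estimate should be noticeably less variable than the human-only sample mean. This simplified version is not claimed to carry the "safe default" guarantee described above; that property belongs to the paper's own construction.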