🤖 AI Summary
This study addresses the absence of a unified framework for quantifying how much pretraining knowledge large language models (LLMs) bring to predicting human behavior. To this end, it introduces the "equivalent sample size," a novel metric that estimates the amount of task-specific data a model trained on that data would need to match the LLM's observed predictive accuracy. The authors develop an asymptotic statistical inference framework that combines flexible machine learning methods, cross-validation, and a comparison of prediction errors. Empirical validation on household income panel data (the Panel Study of Income Dynamics) shows that LLMs encode substantial predictive information for some economic variables but little for others, indicating that their value as a substitute for domain-specific data is highly context-dependent.
📝 Abstract
Large language models (LLMs) are increasingly used to predict human behavior. We propose a measure for evaluating how much knowledge a pretrained LLM brings to such a prediction: its equivalent sample size, defined as the amount of task-specific data needed to match the predictive accuracy of the LLM. We estimate this measure by comparing the prediction error of a fixed LLM in a given domain to that of flexible machine learning models trained on increasing samples of domain-specific data. We further provide a statistical inference procedure by developing a new asymptotic theory for cross-validated prediction error. Finally, we apply this method to the Panel Study of Income Dynamics. We find that LLMs encode considerable predictive information for some economic variables but much less for others, suggesting that their value as substitutes for domain-specific data differs markedly across settings.
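To make the estimation idea concrete, here is a minimal sketch in Python of the learning-curve comparison described above. It is an illustration under simplifying assumptions, not the authors' implementation: the choice of gradient boosting as the flexible model, the hypothetical function names (`learning_curve_errors`, `equivalent_sample_size`), and the linear interpolation of the crossing point are all assumptions, and the paper's asymptotic inference theory for cross-validated prediction error is not reproduced here.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import cross_val_score


def learning_curve_errors(X, y, sample_sizes, cv=5, seed=0):
    """Cross-validated MSE of a flexible ML model trained on nested
    subsamples of the domain-specific data, one value per sample size."""
    order = np.random.default_rng(seed).permutation(len(y))
    errors = []
    for n in sample_sizes:
        idx = order[:n]
        model = GradientBoostingRegressor(random_state=seed)  # assumed model choice
        scores = cross_val_score(model, X[idx], y[idx], cv=cv,
                                 scoring="neg_mean_squared_error")
        errors.append(-scores.mean())  # flip sign back to (positive) MSE
    return np.asarray(errors)


def equivalent_sample_size(llm_mse, sample_sizes, curve_mse):
    """Smallest sample size at which the learning curve matches the
    LLM's prediction error, found by linear interpolation. Assumes the
    curve is roughly monotone decreasing in n."""
    sizes = np.asarray(sample_sizes, dtype=float)
    if llm_mse <= curve_mse.min():
        return np.inf  # LLM beats the trained model at every size tried
    if llm_mse >= curve_mse.max():
        return sizes[0]  # LLM is matched before the smallest size tried
    # np.interp needs increasing x-values, so reverse the decreasing curve.
    return float(np.interp(llm_mse, curve_mse[::-1], sizes[::-1]))


# Hypothetical usage: X, y are domain-specific features and outcomes,
# and llm_mse is the fixed LLM's prediction error on held-out data.
# sizes = [100, 200, 400, 800, 1600]
# curve = learning_curve_errors(X, y, sizes)
# print(equivalent_sample_size(llm_mse, sizes, curve))
```

The interpolated crossing point is only a point estimate; the paper's contribution additionally provides confidence statements via a new asymptotic theory for cross-validated prediction error, which a sketch like this omits.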