🤖 AI Summary
This work addresses the limited representational capacity of conventional resume encoders in occupational transition prediction. We propose a paradigm for semantic occupational representation grounded in large language models (LLMs). Methodologically, structured career-history data is converted into resume-like textual sequences, and small-to-medium-sized LLMs are fine-tuned on these sequences with a next-token-prediction objective; the resulting representations are then used as inputs to downstream transition prediction models. Our key contributions are twofold: first, we replace traditional Transformer-based resume encoders with LLMs as the foundational representation mechanism; second, we empirically demonstrate that compact LLMs, when fine-tuned on career data from a different population, outperform larger fine-tuned counterparts. Experiments on predicting workers' next occupations show significant improvements over state-of-the-art baselines, including CAREER, validating the effectiveness and robustness of language-based occupational representation.
📝 Abstract
Vafa et al. (2024) introduced a transformer-based econometric model, CAREER, that predicts a worker's next job as a function of career history (an "occupation model"). CAREER was initially estimated ("pre-trained") on a large, unrepresentative resume dataset, which served as a "foundation model," and parameter estimation was continued ("fine-tuned") using data from a representative survey. CAREER achieved better predictive performance than benchmark models. This paper considers an alternative in which the resume-based foundation model is replaced by a large language model (LLM). We convert the tabular survey data into text files that resemble resumes and fine-tune LLMs on these text files with the objective of predicting the next token (word). The resulting fine-tuned LLM is used as an input to an occupation model, and its predictive performance surpasses all prior models. We demonstrate the value of fine-tuning and further show that, by adding career data from a different population, fine-tuning smaller LLMs surpasses the performance of fine-tuning larger models.
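As a rough illustration of the tabular-to-text step described above, the sketch below renders a worker's job records as a resume-like sequence suitable for next-token-prediction fine-tuning. The field names (`year`, `occupation`, `industry`) and the output layout are invented for this sketch; the paper's actual survey schema and resume template are not shown here.

```python
def career_to_resume_text(records):
    """Render a worker's tabular job history as a resume-like text sequence
    (one line per job, in chronological order)."""
    ordered = sorted(records, key=lambda r: r["year"])
    return "\n".join(
        f"{r['year']}: {r['occupation']} ({r['industry']})" for r in ordered
    )

# Hypothetical example history for a single worker.
history = [
    {"year": 2018, "occupation": "Sales Manager", "industry": "Retail"},
    {"year": 2015, "occupation": "Cashier", "industry": "Retail"},
]
print(career_to_resume_text(history))
# → 2015: Cashier (Retail)
#   2018: Sales Manager (Retail)
```

Text sequences like this would then be fed to a standard causal-language-model fine-tuning loop, so the LLM learns to predict each token of the career history from the tokens before it.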