🤖 AI Summary
To address the limited domain adaptability of large language models (LLMs) in e-commerce, this paper introduces e-Llama: domain-specific foundation models built upon Llama 3.1 via continual pretraining on 1 trillion multilingual e-commerce tokens, yielding 8B- and 70B-parameter variants. Methodologically, the authors curate high-quality e-commerce data, optimize hyperparameters through ablation studies, apply model merging techniques, and establish a unified evaluation framework. They propose a multilingual e-commerce benchmark to rigorously assess domain adaptation without compromising general-purpose capabilities, and explore merging the base and adapted models for fine-grained, controllable cross-domain performance trade-offs. Experimental results demonstrate that e-Llama significantly outperforms baselines across diverse e-commerce tasks while preserving Llama 3.1’s general linguistic competence; the merging strategy enables flexible, adjustable balancing between domain-specific and general-purpose performance.
📝 Abstract
We present the e-Llama models: 8 billion and 70 billion parameter large language models adapted to the e-commerce domain. These models are meant as foundation models with deep knowledge of e-commerce, forming a base for instruction tuning and fine-tuning. The e-Llama models are obtained by continually pretraining the Llama 3.1 base models on 1 trillion tokens of domain-specific data. We discuss our approach and motivate our choice of hyperparameters with a series of ablation studies. To quantify how well the models have been adapted to the e-commerce domain, we define and implement a set of multilingual, e-commerce-specific evaluation tasks. We show that, when the training setup is chosen carefully, the Llama 3.1 models can be adapted to the new domain without sacrificing significant performance on general-domain tasks. We also explore merging the adapted model with the base model for better control of the performance trade-off between domains.
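The abstract does not specify which merging method is used; a minimal sketch of one common approach, linear weight interpolation between the base and adapted checkpoints, is shown below. The function name `merge_checkpoints` and the mixing coefficient `alpha` are illustrative assumptions, not the paper's actual API; `alpha = 0.0` recovers the base model and `alpha = 1.0` the fully adapted one, with intermediate values trading general-domain ability against e-commerce specialization.

```python
import numpy as np

def merge_checkpoints(base: dict, adapted: dict, alpha: float) -> dict:
    """Linearly interpolate two checkpoints with matching parameter names.

    merged = (1 - alpha) * base + alpha * adapted, applied per tensor.
    This is a sketch: real checkpoints would be torch state_dicts, but the
    arithmetic is identical.
    """
    assert base.keys() == adapted.keys(), "checkpoints must share parameter names"
    return {
        name: (1.0 - alpha) * base[name] + alpha * adapted[name]
        for name in base
    }

# Toy example: two tiny "layers" standing in for full model weights.
base_ckpt = {"w": np.array([1.0, 2.0]), "b": np.array([0.0])}
adapted_ckpt = {"w": np.array([3.0, 4.0]), "b": np.array([1.0])}

merged = merge_checkpoints(base_ckpt, adapted_ckpt, alpha=0.5)
print(merged["w"])  # [2. 3.]
print(merged["b"])  # [0.5]
```

Sweeping `alpha` over a grid and evaluating each merged model on both domain-specific and general benchmarks would yield the adjustable trade-off curve the abstract alludes to.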