Domain Adaptation of Foundation LLMs for e-Commerce

📅 2025-01-16
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the limited domain adaptability of large language models (LLMs) in e-commerce, this paper introduces e-Llama: a domain-specific foundation model built upon Llama 3.1 via continual pretraining on 1 trillion multilingual e-commerce tokens, yielding 8B- and 70B-parameter variants. Methodologically, the authors curate high-quality e-commerce data, employ ablation-driven hyperparameter optimization, apply model merging techniques, and establish a unified evaluation framework. They propose a multilingual e-commerce benchmark to rigorously assess domain adaptation without compromising general-purpose capabilities, and explore merging the base and adapted models for fine-grained, controllable cross-domain performance trade-offs. Experimental results demonstrate that e-Llama outperforms baselines across diverse e-commerce tasks while preserving Llama 3.1's general-domain performance; the merging strategy enables flexible, adjustable balancing between domain-specific and general-purpose performance.

📝 Abstract
We present the e-Llama models: 8 billion and 70 billion parameter large language models that are adapted towards the e-commerce domain. These models are intended as foundation models with deep knowledge of e-commerce, forming a base for subsequent instruction tuning and fine-tuning. The e-Llama models are obtained by continually pretraining the Llama 3.1 base models on 1 trillion tokens of domain-specific data. We discuss our approach and motivate our choice of hyperparameters with a series of ablation studies. To quantify how well the models have been adapted to the e-commerce domain, we define and implement a set of multilingual, e-commerce-specific evaluation tasks. We show that, when carefully choosing the training setup, the Llama 3.1 models can be adapted towards the new domain without sacrificing significant performance on general domain tasks. We also explore the possibility of merging the adapted model and the base model for better control of the performance trade-off between domains.
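The model-merging idea mentioned in the abstract can be sketched as parameter-wise linear interpolation between the base and the domain-adapted model. Note that the paper does not state its exact merging formula, so the interpolation coefficient `alpha`, the linear scheme, and the toy state dicts below are illustrative assumptions, not the authors' implementation:

```python
def merge_state_dicts(base, adapted, alpha):
    """Parameter-wise linear interpolation of two model state dicts.

    alpha = 0.0 returns the base (general-domain) weights,
    alpha = 1.0 returns the adapted (e-commerce) weights,
    and intermediate values trade off between the two domains.
    Note: this simple linear scheme is an illustrative assumption.
    """
    assert base.keys() == adapted.keys(), "models must share parameter names"
    merged = {}
    for name in base:
        merged[name] = [
            (1 - alpha) * b + alpha * a
            for b, a in zip(base[name], adapted[name])
        ]
    return merged


# Toy example with two-element "weight tensors" per parameter.
base_model = {"w": [0.0, 2.0]}
adapted_model = {"w": [1.0, 4.0]}
halfway = merge_state_dicts(base_model, adapted_model, 0.5)
```

Varying `alpha` at merge time, rather than retraining, is what makes the domain trade-off cheaply adjustable.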
Problem

Research questions and friction points this paper is trying to address.

Large Language Models
E-commerce
Online Shopping Tasks
Innovation

Methods, ideas, or system contributions that make the work stand out.

e-commerce language model
multilingual optimization
combined model performance
Christian Herold, NLP Researcher at eBay (Machine Learning, Neural Networks, Machine Translation)
Michael Kozielski, eBay Inc.
Tala Bazazo, eBay Inc.
Pavel Petrushkov, eBay Inc.
Hadi Hashemi, eBay Inc.
Patrycja Cieplicka, eBay Inc.
Dominika Basaj, eBay Inc.
Shahram Khadivi, eBay Inc.
Natural Language Processing and Machine Learning