ADORE: Autonomous Domain-Oriented Relevance Engine for E-commerce

📅 2025-07-13
🏛️ Annual International ACM SIGIR Conference on Research and Development in Information Retrieval
📈 Citations: 1
Influential: 0
🤖 AI Summary
E-commerce search relevance modeling faces two key challenges: the semantic gap between queries and items, and the scarcity of domain-specific hard negative samples. To address these, we propose a framework of three collaborating modules: (1) a chain-of-thought large language model that automatically generates high-quality training data with intent alignment and behavioral consistency; (2) error-type-aware adversarial sample synthesis to enhance model robustness; and (3) knowledge distillation that incorporates hierarchical key item attributes for lightweight, efficient relevance modeling. By integrating Kahneman-Tversky Optimization (KTO) with neural ranking techniques, our approach establishes a cognitively aligned, resource-efficient, and self-sustaining end-to-end learning system. Extensive offline evaluations and online A/B tests demonstrate significant improvements in search relevance and reduced reliance on manual annotation, with high accuracy, low latency, and strong ranking robustness in industrial deployment.

📝 Abstract
Relevance modeling in e-commerce search remains challenged by the semantic gaps of term-matching methods (e.g., BM25) and by neural models' reliance on scarce domain-specific hard samples. We propose ADORE, a self-sustaining framework that synergizes three innovations: (1) a Rule-aware Relevance Discrimination module, where a Chain-of-Thought LLM generates intent-aligned training data, refined via Kahneman-Tversky Optimization (KTO) to align with user behavior; (2) an Error-type-aware Data Synthesis module that auto-generates adversarial examples to harden robustness; and (3) a Key-attribute-enhanced Knowledge Distillation module that injects domain-specific attribute hierarchies into a deployable student model. ADORE automates annotation, adversarial generation, and distillation, overcoming data scarcity while enhancing reasoning. Large-scale experiments and online A/B testing verify the effectiveness of ADORE. The framework establishes a new paradigm for resource-efficient, cognitively aligned relevance modeling in industrial applications.
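The abstract names KTO but does not spell out the objective. As a rough illustration, the per-example loss from the general KTO formulation can be sketched as below. This is a minimal sketch and not ADORE's implementation: the function name `kto_loss`, the default hyperparameters, and the reading of "desirable" as "agrees with observed user behavior" are all assumptions.

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def kto_loss(policy_logp: float, ref_logp: float, desirable: bool,
             z_ref: float = 0.0, beta: float = 0.1,
             lambda_d: float = 1.0, lambda_u: float = 1.0) -> float:
    """Per-example KTO loss (general formulation; hyperparameters are assumed).

    policy_logp / ref_logp: log-probability of the LLM output under the
        tuned policy and under the frozen reference model.
    desirable: True if the output is labeled desirable (assumed here to mean
        it agrees with user behavior), False if undesirable.
    z_ref: reference point, an estimate of KL(policy || reference); clamped at 0.
    """
    reward = policy_logp - ref_logp  # implicit reward: log-probability ratio
    z = max(z_ref, 0.0)
    if desirable:
        # Loss shrinks as the reward rises above the reference point.
        return lambda_d * (1.0 - sigmoid(beta * (reward - z)))
    # Loss shrinks as the reward falls below the reference point.
    return lambda_u * (1.0 - sigmoid(beta * (z - reward)))


# With equal log-probabilities and z_ref = 0 the loss is exactly 0.5.
print(kto_loss(-1.0, -1.0, desirable=True))  # 0.5
```

Unlike pairwise preference objectives, this loss needs only a binary desirable/undesirable label per output, which is one plausible reason it suits signals derived from user behavior.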
Problem

Research questions and friction points this paper is trying to address.

Addresses semantic gaps in e-commerce search relevance modeling.
Overcomes neural models' reliance on scarce domain-specific hard samples.
Automates annotation, adversarial generation, and distillation to enhance reasoning.
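To make the idea of error-type-aware hard negatives concrete, the toy sketch below corrupts a single attribute of a relevant item so that the result is close to the original but no longer relevant. The abstract does not say how ADORE generates its adversarial examples, so everything here is hypothetical: the error types, the attribute names, and the helper `synthesize_negative` are illustrative assumptions only.

```python
import random

# Hypothetical error types, each mapped to the item attribute it corrupts.
ERROR_TYPES = {
    "brand_mismatch": "brand",
    "category_mismatch": "category",
}


def synthesize_negative(item: dict, error_type: str, pool: dict,
                        rng: random.Random) -> dict:
    """Copy a relevant item and swap one attribute for a different value.

    item: attributes of an item known to be relevant to some query.
    error_type: key of ERROR_TYPES selecting which attribute to corrupt.
    pool: candidate values per attribute, e.g. {"brand": ["A", "B"]}.
    """
    field = ERROR_TYPES[error_type]
    candidates = [v for v in pool[field] if v != item[field]]
    if not candidates:
        raise ValueError(f"no alternative value available for {field!r}")
    negative = dict(item)  # shallow copy; the original item is left untouched
    negative[field] = rng.choice(candidates)
    return {"item": negative, "label": 0, "error_type": error_type}


item = {"title": "running shoes", "brand": "A", "category": "shoes"}
pool = {"brand": ["A", "B"], "category": ["shoes", "socks"]}
sample = synthesize_negative(item, "brand_mismatch", pool, random.Random(0))
print(sample["item"]["brand"])  # B (the only alternative in the pool)
```

Tagging each synthetic sample with its error type makes it possible to check which kinds of mistakes a relevance model still makes after training.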
Innovation

Methods, ideas, or system contributions that make the work stand out.

A Chain-of-Thought LLM generates intent-aligned training data, refined via KTO.
Auto-generates adversarial examples to enhance robustness.
Distills domain-specific attribute hierarchies into a deployable student model.
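The distillation step can be pictured with a generic soft-label objective, in which a small student model is trained against both the hard relevance label and the teacher's predicted probability. This is a minimal sketch of standard knowledge distillation for binary relevance, not the paper's loss: the weighting `alpha` and the function names are assumptions, and the summary does not describe how the attribute hierarchy is injected, so that part is left out.

```python
import math


def bce(p: float, target: float, eps: float = 1e-7) -> float:
    """Binary cross-entropy of prediction p against a (possibly soft) target."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))


def distill_loss(student_prob: float, teacher_prob: float, label: int,
                 alpha: float = 0.5) -> float:
    """Weighted sum of hard-label loss and teacher-matching loss.

    student_prob: relevance probability predicted by the small student model.
    teacher_prob: relevance probability from the large teacher (e.g. an LLM).
    label: hard relevance label, 0 or 1.
    alpha: assumed trade-off; 1.0 ignores the teacher, 0.0 ignores the label.
    """
    hard = bce(student_prob, float(label))
    soft = bce(student_prob, teacher_prob)
    return alpha * hard + (1.0 - alpha) * soft


# A student that agrees with the teacher is penalized less than one that does not.
print(distill_loss(0.9, 0.9, 1) < distill_loss(0.1, 0.9, 1))  # True
```

The soft term carries the teacher's confidence, which a hard 0/1 label cannot express; that is the usual motivation for distilling a large model into one cheap enough for online serving.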
Authors
Zheng Fang
JD.COM, Beijing, China
Donghao Xie
JD.COM, Beijing, China
Ming Pang
JD.COM, Beijing, China
Chunyuan Yuan
Ph.D., UCAS China
NLP & IR
Xue Jiang
JD.COM, Beijing, China
Changping Peng
JD.COM, Beijing, China
Zhangang Lin
JD.COM, Beijing, China
Zheng Luo
PhD student, UCLA