🤖 AI Summary
This work addresses two limitations of AI assistants in image-based e-commerce. First, without adequate behavioral constraints for diverse and dynamic user intents, such as product search and style recommendation, response quality and compliance suffer. Second, manually maintained skill systems scale poorly. To overcome these challenges, we propose SkillChain, the first system that establishes a closed-loop framework for automatic skill discovery, alignment, and iterative refinement in image-centric e-commerce scenarios. SkillChain integrates three components: a Skill Creator grounded in task specifications and interaction trajectories, a Route Optimizer for intent-aware routing alignment, and a Body Refiner powered by a dual-path LLM-Judge mechanism. Experiments show substantial improvements in the structural compliance and content quality of responses, and online A/B tests confirm significant gains in user engagement, content consumption, and long-term retention.
📝 Abstract
Image-based AI assistants are now deployed at production scale on e-commerce platforms, where a single uploaded image can trigger fundamentally different user intents: product search, style recommendation, visual encyclopedia, or utility tool calls, each demanding its own response format, tool invocation, and domain knowledge. Without per-intent behavioral constraints, LLM-based systems conflate these heterogeneous modes and fall short of domain quality standards, while the breadth and dynamism of the intent space render manual engineering infeasible. To address this, we present SkillChain, which closes the production feedback loop on Skill evolution, automating the lifecycle of Skills through three stages: Skill Creator for bootstrapping from task specs and trajectories, Route Optimizer for routing alignment, and Body Refiner for iterative Skill Body refinement via dual-path LLM-Judge evaluation. Deployed on a production-scale e-commerce image assistant, SkillChain substantially improves aggregate response quality, with the strongest gains on structural compliance and content quality; a one-week online A/B experiment further confirms significant gains in user engagement, content consumption, and long-term retention.
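The three-stage lifecycle described above can be pictured with a minimal Python sketch. This is an illustration under stated assumptions, not the authors' implementation: the `Skill` fields, the function names, the score threshold, and the round limit are all hypothetical, and the model calls are replaced by plain callables. The abstract does not say what the two judge paths are; the sketch assumes one scores structural compliance and the other content quality, since those are the two dimensions the abstract reports gains on.

```python
from dataclasses import dataclass, replace
from statistics import mean
from typing import Callable, List

# Hypothetical stand-ins for model calls; none of these names come from the paper.
Generate = Callable[[str, str], str]   # (skill body, user query) -> response
Judge = Callable[[str], float]         # response -> score in [0, 1]
Rewrite = Callable[[str, str], str]    # (skill body, feedback) -> revised body


@dataclass(frozen=True)
class Skill:
    name: str    # intent covered, e.g. "product_search"
    route: str   # text used to route requests to this Skill
    body: str    # behavioral constraints given to the assistant


def create_skill(name: str, task_spec: str, trajectories: List[str]) -> Skill:
    """Stage 1, Skill Creator: bootstrap from a task spec and trajectories."""
    examples = "\n".join(f"- {t}" for t in trajectories)
    return Skill(name=name, route=task_spec,
                 body=f"{task_spec}\nExamples:\n{examples}")


def optimize_route(skill: Skill, missed_queries: List[str]) -> Skill:
    """Stage 2, Route Optimizer: widen the route text to cover missed queries."""
    if not missed_queries:
        return skill
    extra = "; ".join(missed_queries)
    return replace(skill, route=f"{skill.route} Also covers: {extra}")


def refine_body(skill: Skill, queries: List[str], generate: Generate,
                structure_judge: Judge, content_judge: Judge,
                rewrite: Rewrite, threshold: float = 0.8,
                max_rounds: int = 3) -> Skill:
    """Stage 3, Body Refiner: rewrite the body until both judge paths pass."""
    for _ in range(max_rounds):
        responses = [generate(skill.body, q) for q in queries]
        # Assumed dual path: one judge per quality dimension.
        structure = mean(structure_judge(r) for r in responses)
        content = mean(content_judge(r) for r in responses)
        if structure >= threshold and content >= threshold:
            break
        feedback = f"structure={structure:.2f}, content={content:.2f}"
        skill = replace(skill, body=rewrite(skill.body, feedback))
    return skill
```

In this sketch the loop closes because `refine_body` regenerates responses with the revised body before judging again, so each rewrite is evaluated on its own output. A production system would presumably draw `queries` and `missed_queries` from logged traffic, which the abstract calls the production feedback loop.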