AI Summary
This work addresses two limitations of industrial e-commerce search: the retrieval stage relies on multi-branch recall that is merged by hand-tuned rules without end-to-end optimization, and existing generative retrieval methods cannot absorb real-time lexicon edits without retraining. The authors propose OneRetrieval, a framework that combines Keyword-Aligned Encoding (KAE), an information-theoretic grouping of attribute categories into codebooks of non-uniform capacity, reserved slots that can be bound to new words after deployment, and a four-stage fine-tuning pipeline. The authors describe it as, to their knowledge, the first generative retrieval method that pairs competitive recall with the real-time editability of an inverted index. Evaluated on five million real-traffic requests, OneRetrieval matches the deep recall of the strongest generative baseline and achieves an intervention hit rate more than an order of magnitude above closed-codebook encodings. Online, replacing the inverted-index branch significantly lifts order volume, and extending the model to nearly the entire retrieval stage holds conversion while improving CTR. The system is deployed at Kuaishou, serving hundreds of millions of page views daily.
Abstract
Industrial e-commerce search serves hundreds of millions of items through a multi-branch retrieval stage fused by hand-tuned merging without joint optimization. Generative retrieval (GR) raises the prospect of collapsing this stage into a single model, yet unification is gated by more than retrieval quality: the inverted-index branch converts below the platform average yet persists because it is almost the only branch where operations can inject a new term within hours without any model update; a one-model substitute must preserve this real-time editability. Existing GR methods structurally lack it: closed-codebook methods fix each slot to a quantized embedding at training, while open-vocabulary methods leave new-term routing to model generalization. We present OneRetrieval, a one-model GR framework built on Keyword-Aligned Encoding (KAE), which ties each identifier position to an interpretable attribute word, pairing competitive recall quality with the editability of the inverted index -- to our knowledge the first editable generative retrieval method. An information-theoretic merging organizes 18 attribute categories into six codebook groups with non-uniform capacity; reserved slots in each codebook can be bound to new words after deployment without retraining; and a four-stage fine-tuning pipeline secures quality and editability jointly. On five million real-traffic requests, OneRetrieval matches the deep recall of the strongest generative baseline, with an intervention hit rate over an order of magnitude above closed-codebook encodings. Online, replacing the inverted-index branch significantly lifts order volume; extending to nearly the entire stage holds conversion while improving CTR. The system is deployed at Kuaishou, serving hundreds of millions of PVs daily.
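The reserved-slot idea can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not say how slots are stored or bound, so the `SlotCodebook` class, the `encode` helper, the attribute groups (`brand`, `category`), and the example words are all hypothetical. In a real model each slot would presumably correspond to learned parameters rather than a dictionary entry. The sketch only shows why an identifier built from interpretable attribute words can accept a new word without changing any existing code.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class SlotCodebook:
    """Toy codebook with a fixed capacity (hypothetical design).

    Slots not yet bound to a word play the role of "reserved" slots.
    """
    capacity: int
    word_to_slot: dict[str, int] = field(default_factory=dict)

    def bind(self, word: str) -> int:
        """Bind `word` to the next free slot and return its slot id."""
        if word in self.word_to_slot:
            return self.word_to_slot[word]
        if len(self.word_to_slot) >= self.capacity:
            raise RuntimeError("no reserved slots left in this codebook")
        slot = len(self.word_to_slot)
        self.word_to_slot[word] = slot
        return slot

    def free_slots(self) -> int:
        """Number of slots still available for post-deployment words."""
        return self.capacity - len(self.word_to_slot)


def encode(item_attrs, codebooks, order):
    """Return one slot id per identifier position.

    A position is None when the item lacks that attribute or the
    attribute word is not bound in the corresponding codebook.
    """
    return tuple(
        codebooks[group].word_to_slot.get(item_attrs.get(group))
        for group in order
    )


# Non-uniform capacity: each group gets its own number of slots.
codebooks = {
    "brand": SlotCodebook(capacity=4),
    "category": SlotCodebook(capacity=8),
}
for known_word in ("acme", "globex"):
    codebooks["brand"].bind(known_word)
codebooks["category"].bind("sneakers")

order = ("brand", "category")
item = {"brand": "initech", "category": "sneakers"}

before = encode(item, codebooks, order)   # "initech" is not bound yet
codebooks["brand"].bind("initech")        # edit made after "deployment"
after = encode(item, codebooks, order)    # existing ids are unchanged
```

In this toy, binding `initech` consumes one reserved `brand` slot and leaves the ids of `acme`, `globex`, and `sneakers` untouched, which is the property that makes the edit local. A closed codebook, by contrast, would have no free slot to bind.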