Efficient Optimization of Hierarchical Identifiers for Generative Recommendation

📅 2025-12-20

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

To address the high computational cost and poor scalability of tree construction in the SEATER generative recommendation model when the item catalog is large, this paper reproduces SEATER, which relies on balanced tree-structured item identifiers and contrastive training objectives, and evaluates two faster tree construction algorithms: a greedy method optimized for minimal build time, and a hybrid method that combines greedy clustering at high levels with more precise grouping at lower levels. Experiments on the datasets from the original SEATER work and on Yambda, a large-scale music recommendation dataset, show that the greedy method cuts tree construction time to less than 2% of the original with only a minor drop in quality on the largest dataset, while the hybrid method cuts it to 5-8% and matches or improves retrieval quality. All data and code are publicly released for full reproducibility.

📝 Abstract

SEATER is a generative retrieval model that improves recommendation inference efficiency and retrieval quality by utilizing balanced tree-structured item identifiers and contrastive training objectives. We reproduce and validate SEATER's reported improvements in retrieval quality over strong baselines across all datasets from the original work, and extend the evaluation to Yambda, a large-scale music recommendation dataset. Our experiments verify SEATER's strong performance, but show that its tree construction step during training becomes a major bottleneck as the number of items grows. To address this, we implement and evaluate two alternative construction algorithms: a greedy method optimized for minimal build time, and a hybrid method that combines greedy clustering at high levels with more precise grouping at lower levels. The greedy method reduces tree construction time to less than 2% of the original with only a minor drop in quality on the dataset with the largest item collection. The hybrid method achieves retrieval quality on par with the original, and even improves on the largest dataset, while cutting construction time to just 5-8%. All data and code are publicly available for full reproducibility at https://github.com/joshrosie/re-seater.
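
To make the idea of balanced tree-structured identifiers concrete, here is a minimal sketch of a greedy top-down construction. It is not the algorithm from the paper or from SEATER: the splitting rule (sort along the highest-variance embedding dimension and cut into `k` near-equal groups) and the names `split_balanced` and `build_identifiers` are assumptions chosen for illustration. Each item's identifier is the sequence of branch indices on the path from the root to its leaf.

```python
from statistics import pvariance


def split_balanced(ids, vecs, k):
    """Greedily split `ids` into k near-equal groups.

    Assumption for illustration: items are sorted along the embedding
    dimension with the largest variance, then cut into contiguous chunks.
    """
    dims = len(vecs[ids[0]])
    best = max(range(dims), key=lambda d: pvariance([vecs[i][d] for i in ids]))
    ordered = sorted(ids, key=lambda i: vecs[i][best])
    base, extra = divmod(len(ordered), k)
    groups, start = [], 0
    for g in range(k):
        size = base + (1 if g < extra else 0)  # spread the remainder evenly
        groups.append(ordered[start:start + size])
        start += size
    return groups


def build_identifiers(vecs, k=2):
    """Return {item_index: tuple of branch indices from root to leaf}."""
    if k < 2:
        raise ValueError("k must be at least 2")
    identifiers = {}

    def recurse(ids, prefix):
        if len(ids) <= k:
            # Small enough: each remaining item becomes its own leaf.
            for branch, item in enumerate(ids):
                identifiers[item] = prefix + (branch,)
            return
        for branch, group in enumerate(split_balanced(ids, vecs, k)):
            recurse(group, prefix + (branch,))

    recurse(list(range(len(vecs))), ())
    return identifiers


if __name__ == "__main__":
    # Eight toy items on a line; with k=2 every identifier has length 3.
    toy = [[float(i), 0.0] for i in range(8)]
    for item, path in sorted(build_identifiers(toy, k=2).items()):
        print(item, path)
```

Because each split only needs one sort, a construction of this kind is cheap compared with running a full clustering procedure at every node, which is the trade-off the greedy and hybrid methods explore.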

Problem

Research questions and friction points this paper is trying to address.

Optimizes hierarchical identifiers for generative recommendation efficiency

Addresses tree construction bottleneck in large-scale item datasets

Evaluates greedy and hybrid tree construction algorithms that reduce build time while maintaining retrieval quality

Innovation

Methods, ideas, or system contributions that make the work stand out.

Balanced tree-structured identifiers improve retrieval efficiency

Contrastive training objectives enhance recommendation quality

Hybrid tree construction algorithm cuts construction time to 5-8% of the original while matching or improving retrieval quality

🔎 Similar Papers

End-to-End Learnable Item Tokenization for Generative Recommendation