On the Embedding Collapse when Scaling up Recommendation Models

Bibliographic Details
Title: On the Embedding Collapse when Scaling up Recommendation Models
Authors: Guo, Xingzhuo; Pan, Junwei; Wang, Ximei; Chen, Baixu; Jiang, Jie; Long, Mingsheng
Publication Year: 2023
Collection: Computer Science
Subject Terms: Computer Science - Machine Learning, Computer Science - Information Retrieval
Description: Recent advances in foundation models have led to a promising trend of developing large recommendation models to leverage vast amounts of available data. Still, mainstream models remain embarrassingly small in size and naïve enlarging does not lead to sufficient performance gain, suggesting a deficiency in the model scalability. In this paper, we identify the embedding collapse phenomenon as the inhibition of scalability, wherein the embedding matrix tends to occupy a low-dimensional subspace. Through empirical and theoretical analysis, we demonstrate a two-sided effect of feature interaction specific to recommendation models. On the one hand, interacting with collapsed embeddings restricts embedding learning and exacerbates the collapse issue. On the other hand, interaction is crucial in mitigating the fitting of spurious features as a scalability guarantee. Based on our analysis, we propose a simple yet effective multi-embedding design incorporating embedding-set-specific interaction modules to learn embedding sets with large diversity and thus reduce collapse. Extensive experiments demonstrate that this proposed design provides consistent scalability and effective collapse mitigation for various recommendation models. Code is available at this repository: https://github.com/thuml/Multi-Embedding.
Comment: Accepted at ICML 2024
Document Type: Working Paper
Access URL: http://arxiv.org/abs/2310.04400
Accession Number: edsarx.2310.04400
Database: arXiv
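
Illustrative note: the description above characterizes embedding collapse as the embedding matrix occupying a low-dimensional subspace. The following is a minimal sketch of how such collapse could be observed numerically, assuming NumPy and synthetic data. The function name `singular_value_ratio` and the choice of metric (sum of singular values divided by the largest one) are assumptions made for illustration; they are not taken from this record and should not be read as the authors' implementation.

```python
import numpy as np


def singular_value_ratio(embedding: np.ndarray) -> float:
    """Hypothetical collapse proxy: sum of singular values over the largest one.

    The value is close to the number of columns when the matrix uses all
    directions evenly and close to 1 when it is dominated by one direction.
    """
    sigma = np.linalg.svd(embedding, compute_uv=False)
    return float(sigma.sum() / sigma.max())


rng = np.random.default_rng(0)
num_items, dim = 1000, 16  # illustrative sizes, not from the paper

# Embeddings that spread over all 16 dimensions.
full = rng.normal(size=(num_items, dim))

# Embeddings confined to a 2-dimensional subspace of the 16-dimensional space.
basis = rng.normal(size=(2, dim))
collapsed = rng.normal(size=(num_items, 2)) @ basis

full_ratio = singular_value_ratio(full)
collapsed_ratio = singular_value_ratio(collapsed)

print("full-rank embeddings:", full_ratio)
print("collapsed embeddings:", collapsed_ratio)
```

In this sketch the collapsed matrix has rank 2, so its ratio cannot meaningfully exceed 2 even though its nominal dimension is 16, which is one way to see that enlarging the embedding size alone does not guarantee that the extra dimensions are used.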