Sliding Window Training -- Utilizing Historical Recommender Systems Data for Foundation Models

Swanand Joshi, Yesu Feng, Ko-Jen Hsiao, Zhe Zhang, Sudarshan Lamkhede

arXiv.org Artificial Intelligence 

Long-lived recommender systems (RecSys) often encounter lengthy user-item interaction histories that span many years. To effectively learn long-term user preferences, large RecSys foundation models (FMs) need to encode this information in pretraining. Usually, this is done either by generating a sequence length long enough to take all history sequences as input, at the cost of a large model input dimension, or by dropping some parts of the user history to accommodate model size and latency requirements on the production serving side.

Oftentimes in industrial applications, foundation models that have inference-time restrictions on serving memory footprint cannot exceed a certain input dimension and model size. This constraint raises the question of how to most effectively utilize a large-scale interaction corpus [1]. The most straightforward way is to truncate historical interactions. This simplification, however, comes at the cost of not using valuable information about user journeys and their rich history of interactions during model training [5].
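To make the contrast concrete, the following is a minimal sketch (function names and parameters are hypothetical, not from the paper) of the two ways of preparing a long interaction history for a fixed-length model input: simple truncation, which discards everything but the most recent interactions, versus sliding-window sampling in the spirit of the title, which slices the full history into multiple fixed-length training examples so that older interactions still contribute to pretraining.

```python
def truncate(history, max_len):
    """Baseline: keep only the most recent max_len interactions,
    discarding the rest of the user's history."""
    return list(history[-max_len:])


def sliding_windows(history, max_len, stride):
    """Yield fixed-length windows over the full history so that older
    interactions also appear in training examples.

    Each window respects the model's input-length constraint (max_len);
    stride controls the overlap between consecutive windows.
    """
    if len(history) <= max_len:
        # Short histories fit in a single input; no windowing needed.
        yield list(history)
        return
    start = 0
    while start + max_len <= len(history):
        yield list(history[start:start + max_len])
        start += stride
    # Always include the most recent interactions, even if the last
    # stride step did not land exactly on the end of the history.
    if (len(history) - max_len) % stride != 0:
        yield list(history[-max_len:])
```

With `history = list(range(10))`, `max_len = 4`, and `stride = 3`, truncation yields only `[6, 7, 8, 9]`, whereas the sliding window produces three training examples covering the entire history: `[0, 1, 2, 3]`, `[3, 4, 5, 6]`, and `[6, 7, 8, 9]`.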