Reinforcement Learning for Slate-based Recommender Systems: A Tractable Decomposition and Practical Methodology
Eugene Ie, Vihan Jain, Jing Wang, Sanmit Narvekar, Ritesh Agarwal, Rui Wu, Heng-Tze Cheng, Morgane Lustman, Vince Gatto, Paul Covington, Jim McFadden, Tushar Chandra, Craig Boutilier
arXiv.org Artificial Intelligence
Recommender systems have become ubiquitous, transforming user interactions with products, services and content in a wide variety of domains. In content recommendation, recommenders generally surface relevant and/or novel personalized content based on learned models of user preferences (e.g., as in collaborative filtering [Breese et al., 1998, Konstan et al., 1997, Srebro et al., 2004, Salakhutdinov and Mnih, 2007]) or predictive models of user responses to specific recommendations. Well-known applications of recommender systems include video recommendations on YouTube [Covington et al., 2016], movie recommendations on Netflix [Gomez-Uribe and Hunt, 2016] and playlist construction on Spotify [Jacobson et al., 2016]. It is increasingly common to train deep neural networks (DNNs) [van den Oord et al., 2013, Wang et al., 2015, Covington et al., 2016, Cheng et al., 2016] to predict user responses (e.g., click-through rates, content engagement, ratings, likes) in order to generate, score and serve candidate recommendations. Practical recommender systems largely focus on myopic prediction, that is, estimating a user's immediate response to a recommendation without considering the long-term impact on subsequent user behavior. This can be limiting: modeling a recommendation's stochastic impact on the future affords opportunities to trade off user engagement in the near term for longer-term benefit (e.g., by probing a user's interests, or improving satisfaction).
May-31-2019
- Country:
  - South America > Argentina
    - Pampas > Buenos Aires F.D. > Buenos Aires (0.04)
  - Oceania > Australia
  - North America > United States
    - New York (0.04)
    - Wisconsin > Dane County
      - Madison (0.04)
    - Texas > Travis County
      - Austin (0.04)
    - North Carolina > Wake County
      - Raleigh (0.04)
    - Minnesota > Hennepin County
      - Minneapolis (0.04)
    - Massachusetts
      - Suffolk County > Boston (0.04)
      - Middlesex County
    - Louisiana > Orleans Parish
      - New Orleans (0.04)
    - California
      - San Francisco County > San Francisco (0.04)
      - Los Angeles County > Long Beach (0.04)
  - Europe
    - United Kingdom > England
      - Cambridgeshire > Cambridge (0.28)
    - Sweden > Stockholm
      - Stockholm (0.04)
    - Italy > Piedmont
      - Turin Province > Turin (0.04)
  - Asia
    - Middle East > Jordan (0.04)
    - Macao (0.04)
    - Myanmar > Tanintharyi Region
      - Dawei (0.04)
    - China > Beijing
      - Beijing (0.04)
- Genre:
  - Research Report > New Finding (0.93)
- Industry:
  - Media (1.00)
  - Leisure & Entertainment (1.00)
  - Information Technology > Services (0.66)