Optimizing Audio Recommendations for the Long-Term: A Reinforcement Learning Perspective

Feb-28-2023–arXiv.org Artificial Intelligence

We study the problem of optimizing a recommender system for outcomes that occur over several weeks or months. We begin by drawing on reinforcement learning to formulate a comprehensive model of users' recurring relationships with a recommender system. Measurement, attribution, and coordination challenges complicate algorithm design. We describe careful modeling -- including a new representation of user state and key conditional independence assumptions -- which overcomes these challenges and leads to simple, testable recommender system prototypes. We apply our approach to a podcast recommender system that makes personalized recommendations to hundreds of millions of listeners. A/B tests demonstrate that purposefully optimizing for long-term outcomes leads to large performance gains over conventional approaches that optimize for short-term proxies.

artificial intelligence, machine learning, reinforcement learning, (19 more...)

arXiv.org Artificial Intelligence

Feb-28-2023

arXiv.org PDF

Add feedback

Country:
- North America
  - United States
    - District of Columbia > Washington (0.04)
    - New York > New York County
      - New York City (0.04)
    - California > Los Angeles County
      - Los Angeles (0.14)
  - Puerto Rico > San Juan
    - San Juan (0.04)
- Asia
  - Middle East > Jordan (0.04)
  - Myanmar > Tanintharyi Region
    - Dawei (0.04)

Genre:
- Research Report
  - Experimental Study (0.67)
  - New Finding (0.45)

Industry:
- Leisure & Entertainment (1.00)
- Media > Music (0.46)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning > Personal Assistant Systems (1.00)
  - Machine Learning
    - Reinforcement Learning (1.00)
    - Learning Graphical Models > Undirected Networks
      - Markov Models (1.00)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found