From Clicks to Conversions: Recommendation for long-term reward

Chagniot, Philomène, Vasile, Flavian, Rohde, David

Sep-1-2020–arXiv.org Machine Learning

A modern approach to recommendation will look at this log in order to improve future recommendations. By examining how similar users respond to different recommendations, it becomes possible to discover better recommendations and continue to improve the system. This procedure of learning by experimentation in some respects mimics randomized controlled trials in medicine, where a population is split in two and different treatments are delivered to similar groups. Medical trials are simpler, however: an intervention or a placebo is administered to each group, and long-term impacts are then observed with no further interventions.

The challenges of credit attribution in the case of delayed reward and multiple actions

In contrast with medical trials, where the treatment is frequently a binary variable, recommender systems deliver multiple actions at variable times, leading to combinatorially complex treatments. For simplicity, in our previous work on RecoGym [2], we assumed that both the current recommendation and the reward are conditionally independent of past actions, which makes the recommendation problem amenable to contextual bandit and supervised value modeling approaches.
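To make the simplifying assumption concrete, here is a minimal sketch of a supervised value model fitted on logged data, assuming each log entry is an independent (context, action, reward) tuple so that past actions can be ignored. This is not the RecoGym API or the authors' method: the function names, the toy log, and the smoothing prior are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def fit_value_model(log, prior_clicks=1.0, prior_views=2.0):
    """Estimate the expected immediate reward of each (context, action) pair.

    `log` is an iterable of (context, action, reward) tuples. Each tuple is
    treated as independent of earlier actions, which is the simplifying
    assumption described above. The prior values are hypothetical smoothing
    constants that pull sparse estimates toward 0.5.
    """
    clicks = defaultdict(float)
    views = defaultdict(float)
    for context, action, reward in log:
        views[(context, action)] += 1.0
        clicks[(context, action)] += reward

    def value(context, action):
        key = (context, action)
        return (clicks.get(key, 0.0) + prior_clicks) / (views.get(key, 0.0) + prior_views)

    return value


def greedy_recommend(value, context, actions):
    """Pick the action with the highest estimated immediate reward."""
    return max(actions, key=lambda action: value(context, action))


# Hypothetical toy log: (user context, recommended item, click indicator).
toy_log = [
    ("sports", "shoes", 1), ("sports", "shoes", 0), ("sports", "shoes", 1),
    ("sports", "book", 0), ("sports", "book", 0),
    ("reading", "book", 1), ("reading", "shoes", 0),
]

value = fit_value_model(toy_log)
print(greedy_recommend(value, "sports", ["shoes", "book"]))
print(greedy_recommend(value, "reading", ["shoes", "book"]))
```

Because the model scores only the immediate reward of a single action, it cannot attribute a delayed reward to any of several earlier recommendations, which is exactly the limitation the text raises.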

artificial intelligence, machine learning, recommendation


