A Model-Based Reinforcement Learning with Adversarial Training for Online Recommendation

Oct-11-2024, 03:47:23 GMT, Neural Information Processing Systems

Reinforcement learning is effective in optimizing policies for recommender systems. Current solutions mostly focus on model-free approaches, which require frequent interactions with the real environment and thus make model learning expensive. Offline evaluation methods, such as importance sampling, can alleviate this limitation, but they usually require a large amount of logged data and do not work well when the action space is large. In this work, we propose a model-based reinforcement learning solution that models the user-agent interaction for offline policy learning via a generative adversarial network. To reduce bias in the learned policy, we use the discriminator to evaluate the quality of generated sequences and rescale the generated rewards.
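The reward-rescaling idea in the last sentence can be illustrated with a minimal sketch. It assumes the discriminator returns a single score in [0, 1] per generated sequence (higher meaning more realistic) and that rescaling is a plain multiplication; the abstract does not specify either detail, and the function names `rescale_rewards` and `discounted_return` are hypothetical, not taken from the paper.

```python
from typing import List, Sequence


def rescale_rewards(rewards: Sequence[float], disc_score: float) -> List[float]:
    """Scale simulated rewards by a discriminator score in [0, 1].

    Assumption: multiplicative rescaling, so sequences the discriminator
    judges unrealistic contribute less to policy learning.
    """
    if not 0.0 <= disc_score <= 1.0:
        raise ValueError("disc_score must lie in [0, 1]")
    return [disc_score * r for r in rewards]


def discounted_return(rewards: Sequence[float], gamma: float = 0.9) -> float:
    """Standard discounted sum of rewards, accumulated from the last step back."""
    total = 0.0
    for r in reversed(rewards):
        total = r + gamma * total
    return total


# Hypothetical generated sequence: click, no click, click.
simulated = [1.0, 0.0, 1.0]
trusted = rescale_rewards(simulated, disc_score=0.5)
weighted_return = discounted_return(trusted, gamma=0.5)
```

With this form, a sequence scored near 0 by the discriminator has almost no influence on the return used for the policy update, which is one simple way to limit the bias introduced by an imperfect user model.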

adversarial training, model-based reinforcement learning, online recommendation

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.93)