Non-Stationary Contextual Bandit Learning via Neural Predictive Ensemble Sampling

Zhu, Zheqing, Liu, Yueyang, Kuang, Xu, Van Roy, Benjamin

Oct-14-2023–arXiv.org Artificial Intelligence

Real-world applications of contextual bandits often exhibit non-stationarity due to seasonality, serendipity, and evolving social trends. While a number of non-stationary contextual bandit learning algorithms have been proposed in the literature, they explore excessively because they do not prioritize information of enduring value, or are designed in ways that do not scale to modern applications with high-dimensional user-specific features and large action sets, or both. In this paper, we introduce a novel non-stationary contextual bandit algorithm that addresses these concerns. It combines a scalable, deep-neural-network-based architecture with a carefully designed exploration mechanism that strategically prioritizes collecting information with the most lasting value in a non-stationary environment. Through empirical evaluations on two real-world recommendation datasets, which exhibit pronounced non-stationarity, we demonstrate that our approach significantly outperforms the state-of-the-art baselines.

bandit, data mining, machine learning

Country:
- North America > United States > California > Santa Clara County (0.14)

Genre:
- Research Report (0.50)

Industry:
- Information Technology (0.46)

Technology:
- Information Technology
  - Artificial Intelligence
    - Machine Learning > Neural Networks
      - Deep Learning (0.66)
    - Representation & Reasoning (1.00)
  - Data Science > Data Mining
    - Big Data (0.89)
