Empirical Comparison of Forgetting Mechanisms for UCB-based Algorithms on a Data-Driven Simulation Platform

Nov-25-2025–arXiv.org Artificial Intelligence

Many real-world bandit problems involve non-stationary reward distributions, where the optimal decision may shift due to evolving environments. However, the performance of some typical Multi-Armed Bandit (MAB) models such as Upper Confidence Bound (UCB) algorithms degrades significantly in non-stationary environments where reward distributions change over time. To address this limitation, this paper introduces and evaluates FDSW-UCB, a novel dual-view algorithm that integrates a discount-based long-term perspective with a sliding-window-based short-term view. A data-driven semi-synthetic simulation platform, built upon the MovieLens-1M and Open Bandit datasets, is developed to test algorithm adaptability under abrupt and gradual drift scenarios. Experimental results demonstrate that a well-configured sliding-window mechanism (SW-UCB) is robust, while the widely used discounting method (D-UCB) suffers from a fundamental learning failure, leading to linear regret. Crucially, the proposed FDSW-UCB, when employing an optimistic aggregation strategy, achieves superior performance in dynamic settings, highlighting that the ensemble strategy itself is a decisive factor for success.

artificial intelligence, data mining, machine learning, (20 more...)

arXiv.org Artificial Intelligence

Nov-25-2025

arXiv.org PDF

Add feedback

Country:
- Asia (0.28)

Genre:
- Research Report > New Finding (0.48)

Industry:
- Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:
- Information Technology
  - Data Science > Data Mining
    - Big Data (1.00)
  - Artificial Intelligence
    - Representation & Reasoning > Personal Assistant Systems (0.93)
    - Machine Learning > Statistical Learning
      - Clustering (0.46)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found