EXP4-DFDC: A Non-Stochastic Multi-Armed Bandit for Cache Replacement

Yusuf, Farzana Beente, Valdes, Camilo, Stebliankin, Vitalii, Vietri, Giuseppe, Narasimhan, Giri

arXiv.org Machine Learning 

In this work we study a variant of the well-known multi-armed bandit (MAB) problem, which has the properties of a delay in feedback, and a loss that declines over time. We introduce an algorithm, EXP4-DFDC, to solve this MAB variant, and demonstrate that the regret vanishes as the time increases. We also show that LeCaR, a previously published machine learning-based cache replacement algorithm, is an instance of EXP4-DFDC. Our results can be used to provide insight on the choice of hyperparameters, and optimize future LeCaR instances.

Duplicate Docs Excel Report

Title
None found

Similar Docs  Excel Report  more

TitleSimilaritySource
None found