Thresholding Graph Bandits with GrAPL

LeJeune, Daniel, Dasarathy, Gautam, Baraniuk, Richard G.

arXiv.org Machine Learning 

Systems that recommend products, services, or other attention targets have become indispensable for the effective curation of information. Such personalization and recommendation techniques are now ubiquitous not only in product/content recommendation and ad placement but also in a wide range of applications such as drug testing, spatial sampling, environmental monitoring, and rate adaptation in communication networks; see, e.g., Villar et al. (2015); Combes et al. (2014); Srinivas et al. (2010). These settings are often modeled as sequential decision-making or bandit problems, where an algorithm must choose among a set of decisions (or arms) sequentially to maximize a desired performance criterion. Recently, an important variant of the bandit problem was proposed by Locatelli et al. (2016) and Gotovos et al. (2013), where the goal is to rapidly identify all arms whose means lie above (and below) a fixed threshold. This thresholding bandit framework, which may be thought of as a version of the combinatorial pure exploration problem (Chen et al., 2014), is useful in various applications: environmental monitoring, where one might want to identify the hypoxic (low-oxygen-content) regions in a lake; crowd-sourcing, where one might want to retain all workers whose productivity exceeds the cost of hiring them; or political polling, where one wants to identify which political candidate each voting district prefers.
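The thresholding bandit setting described above can be illustrated with a small simulation. The sketch below uses an APT-style arm-selection rule in the spirit of Locatelli et al. (2016) — pull the arm whose empirical mean is least confidently separated from the threshold — and then classifies each arm by comparing its estimate to the threshold. The arm means, Gaussian noise model, and parameter values here are illustrative assumptions for exposition, not the GrAPL algorithm of this paper.

```python
import math
import random

def apt_thresholding(means, tau, budget, eps=0.05, noise_std=0.2, seed=0):
    """Simulate a thresholding bandit with Gaussian-noise arms.

    means     : true (unknown to the learner) arm means, used only to draw rewards
    tau       : the fixed threshold; the goal is to label each arm as above/below it
    budget    : total number of arm pulls allowed
    eps       : precision parameter of the APT-style index
    Returns a list of booleans: True if the arm is estimated to be above tau.
    """
    rng = random.Random(seed)
    k = len(means)
    counts = [0] * k
    sums = [0.0] * k

    # Initialization: pull each arm once so every empirical mean is defined.
    for i in range(k):
        sums[i] += rng.gauss(means[i], noise_std)
        counts[i] = 1

    for _ in range(budget - k):
        # APT-style index: sqrt(T_i) * (|mu_hat_i - tau| + eps).
        # A small index means the arm is still ambiguous relative to the
        # threshold, so we pull the arm minimizing it.
        i = min(
            range(k),
            key=lambda j: math.sqrt(counts[j]) * (abs(sums[j] / counts[j] - tau) + eps),
        )
        sums[i] += rng.gauss(means[i], noise_std)
        counts[i] += 1

    # Final classification: empirical mean vs. threshold.
    return [sums[i] / counts[i] >= tau for i in range(k)]

# Four hypothetical arms; the two middle ones are close to the threshold 0.5,
# so the APT-style rule spends most of its budget disambiguating them.
above = apt_thresholding([0.1, 0.4, 0.6, 0.9], tau=0.5, budget=2000)
```

With this budget and noise level, the learner reliably labels the first two arms as below the threshold and the last two as above it; the interesting behavior is that the pull counts concentrate on the near-threshold arms, which is exactly the adaptivity that distinguishes thresholding bandits from uniform sampling.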
