interactive submodular bandit
Interactive Submodular Bandit
In many machine learning applications, submodular functions have been used as a model for evaluating the utility or payoff of a set such as news items to recommend, sensors to deploy in a terrain, nodes to influence in a social network, to name a few. At the heart of all these applications is the assumption that the underlying utility/payoff function is known a priori, hence maximizing it is in principle possible. In real life situations, however, the utility function is not fully known in advance and can only be estimated via interactions. For instance, whether a user likes a movie or not can be reliably evaluated only after it was shown to her. Or, the range of influence of a user in a social network can be estimated only after she is selected to advertise the product.
Reviews: Interactive Submodular Bandit
Pros: - The paper combines submodular function maximization with contextual bandit, and propose the problem of interactive submodular bandit formulation and SM-UCB solution. It includes several existing studies as special cases. Cons: - The discussion on the key RKHS condition is not enough, and its applicability for actual applications is unclear. More importantly, there is no discussion about constant B for the RKHS norm for various applications, and also the maximum information gain \gamma_t, but both of them are needed in the algorithm. The additional technical contribution is small.
Interactive Submodular Bandit
Chen, Lin, Krause, Andreas, Karbasi, Amin
In many machine learning applications, submodular functions have been used as a model for evaluating the utility or payoff of a set such as news items to recommend, sensors to deploy in a terrain, nodes to influence in a social network, to name a few. At the heart of all these applications is the assumption that the underlying utility/payoff function is known a priori, hence maximizing it is in principle possible. In real life situations, however, the utility function is not fully known in advance and can only be estimated via interactions. For instance, whether a user likes a movie or not can be reliably evaluated only after it was shown to her. Or, the range of influence of a user in a social network can be estimated only after she is selected to advertise the product.