bayesucb
- North America > United States > Texas > Travis County > Austin (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Information Technology > Data Science > Data Mining > Big Data (0.47)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)
Finite-Time Logarithmic Bayes Regret Upper Bounds
Atsidakou, Alexia, Kveton, Branislav, Katariya, Sumeet, Caramanis, Constantine, Sanghavi, Sujay
We derive the first finite-time logarithmic Bayes regret upper bounds for Bayesian bandits. In Gaussian bandits, we obtain $O(c_\Delta \log n)$ and $O(c_h \log^2 n)$ bounds for an upper confidence bound algorithm, where $c_h$ and $c_\Delta$ are constants depending on the prior distribution and the gaps of random bandit instances sampled from it, respectively. The latter bound asymptotically matches the lower bound of Lai (1987). Our proofs are a major technical departure from prior works, while being simple and general. To show the generality of our techniques, we apply them to linear bandits. Our results provide insights on the value of the prior in the Bayesian setting, both in the objective and as side information given to the learner. They significantly improve upon existing $\tilde{O}(\sqrt{n})$ bounds, which have become standard in the literature despite the existing lower bounds.
- North America > United States > Texas > Travis County > Austin (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Cost-Efficient Online Decision Making: A Combinatorial Multi-Armed Bandit Approach
Rahbar, Arman, Åkerblom, Niklas, Chehreghani, Morteza Haghir
Online decision making plays a crucial role in numerous real-world applications. In many scenarios, the decision is made based on performing a sequence of tests on the incoming data points. However, performing all tests can be expensive and is not always possible. In this paper, we provide a novel formulation of the online decision making problem based on combinatorial multi-armed bandits and take the cost of performing tests into account. Based on this formulation, we provide a new framework for cost-efficient online decision making which can utilize posterior sampling or BayesUCB for exploration. We provide a rigorous theoretical analysis for our framework and present various experimental results that demonstrate its applicability to real-world problems.
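As a rough illustration of the BayesUCB idea named in this abstract, the sketch below scores each arm by an upper quantile of its posterior and plays the arm with the highest score. It is not the framework from the paper: the Gaussian reward model with known noise variance, the quantile schedule `1 - 1/(t + 1)`, and all function names are assumptions made for the example.

```python
from statistics import NormalDist


def gaussian_posterior(prior_mean, prior_var, noise_var, rewards):
    """Conjugate update for a Gaussian mean with known noise variance (assumed model)."""
    n = len(rewards)
    post_var = 1.0 / (1.0 / prior_var + n / noise_var)
    post_mean = post_var * (prior_mean / prior_var + sum(rewards) / noise_var)
    return post_mean, post_var


def bayes_ucb_index(post_mean, post_var, t):
    """Upper posterior quantile used as the arm's index.

    The level 1 - 1/(t + 1) is an assumed schedule chosen so that it stays
    strictly inside (0, 1) for every round t >= 1.
    """
    level = 1.0 - 1.0 / (t + 1)
    return NormalDist(post_mean, post_var ** 0.5).inv_cdf(level)


def select_arm(histories, t, prior_mean=0.0, prior_var=1.0, noise_var=1.0):
    """Return the position of the arm with the largest quantile index.

    histories: one list of observed rewards per arm (hypothetical input format).
    """
    indices = [
        bayes_ucb_index(*gaussian_posterior(prior_mean, prior_var, noise_var, r), t)
        for r in histories
    ]
    return max(range(len(indices)), key=indices.__getitem__)
```

An arm with no observations keeps the full prior variance, so its quantile index stays wide and it is still explored even when another arm has a higher posterior mean.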
- Europe > Sweden > Vaestra Goetaland > Gothenburg (0.05)
- North America > United States > Wisconsin (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Health & Medicine (0.97)
- Automobiles & Trucks (0.94)
- Transportation > Ground > Road (0.68)
A Contextual Combinatorial Semi-Bandit Approach to Network Bottleneck Identification
Hoseini, Fazeleh, Åkerblom, Niklas, Chehreghani, Morteza Haghir
Bottleneck identification is an essential task in network analysis with numerous important applications, such as traffic planning and road network management. For example, in a road network, the road segment with the highest cost is described as a path-specific bottleneck on a path between a source node and a destination node. The cost or weight can be defined according to specific criteria, such as travel time, energy consumption, etc. The aim is to find a path which minimizes the bottleneck among all paths connecting the source and destination nodes. Bottleneck identification can thus be characterized, in a given road network graph, as finding a path with the smallest maximum edge weight among the paths connecting the source node and the destination node, i.e., finding the minimax edge. By negating the edge weights, bottleneck identification can also be viewed as the widest path problem or the maximum capacity path problem [20].
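The minimax path described in this abstract can be computed offline, when all edge weights are known, with a Dijkstra-style search that tracks the largest edge weight seen along each path instead of the sum. The sketch below shows only that offline step, not the bandit method of the paper; the adjacency-list format and the name `minimax_path` are assumptions for the example.

```python
import heapq


def minimax_path(graph, source, target):
    """Find a path whose largest edge weight is as small as possible.

    graph: dict mapping node -> list of (neighbor, weight) pairs (assumed format).
    Returns (bottleneck_weight, path), or (None, None) if target is unreachable.
    """
    best = {source: float("-inf")}  # smallest bottleneck found so far per node
    parent = {source: None}
    heap = [(float("-inf"), source)]
    while heap:
        bottleneck, node = heapq.heappop(heap)
        if bottleneck > best.get(node, float("inf")):
            continue  # stale heap entry
        if node == target:
            break
        for neighbor, weight in graph.get(node, []):
            candidate = max(bottleneck, weight)
            if candidate < best.get(neighbor, float("inf")):
                best[neighbor] = candidate
                parent[neighbor] = node
                heapq.heappush(heap, (candidate, neighbor))
    if target not in best:
        return None, None
    path = []
    node = target
    while node is not None:
        path.append(node)
        node = parent[node]
    return best[target], path[::-1]


# Toy road network: s-a-t has bottleneck 4, s-b-t has bottleneck 3.
roads = {
    "s": [("a", 4), ("b", 2)],
    "a": [("t", 1)],
    "b": [("t", 3)],
    "t": [],
}
print(minimax_path(roads, "s", "t"))  # (3, ['s', 'b', 't'])
```

Replacing the sum in Dijkstra's relaxation step with a maximum keeps the greedy argument valid, because the bottleneck of a path never decreases as edges are appended.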
- Europe > Sweden > Vaestra Goetaland > Gothenburg (0.05)
- Europe > Luxembourg > Luxembourg Canton > Luxembourg City (0.04)
- Transportation > Infrastructure & Services (1.00)
- Transportation > Ground > Road (1.00)
- Information Technology > Communications > Networks (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)
- Information Technology > Data Science > Data Mining > Big Data (0.47)
Strategies for Safe Multi-Armed Bandits with Logarithmic Regret and Risk
Chen, Tianrui, Gangrade, Aditya, Saligrama, Venkatesh
We investigate a natural but surprisingly unstudied approach to the multi-armed bandit problem under safety risk constraints. Each arm is associated with an unknown law on safety risks and rewards, and the learner's goal is to maximise reward whilst not playing unsafe arms, as determined by a given threshold on the mean risk. We formulate a pseudo-regret for this setting that enforces this safety constraint in a per-round way by softly penalising any violation, regardless of the gain in reward due to the same. This has practical relevance to scenarios such as clinical trials, where one must maintain safety for each round rather than in an aggregated sense. We describe doubly optimistic strategies for this scenario, which maintain optimistic indices for both safety risk and reward. We show that schema based on both frequentist and Bayesian indices satisfy tight gap-dependent logarithmic regret bounds, and further that these play unsafe arms only logarithmically many times in total. This theoretical analysis is complemented by simulation studies demonstrating the effectiveness of the proposed schema, and probing the domains in which their use is appropriate.
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)