AITopics | bayesucb

Finite-Time Logarithmic Bayes Regret Upper Bounds

Neural Information Processing Systems | Apr-24-2026, 19:33:57 GMT

We derive the first finite-time logarithmic Bayes regret upper bounds for Bayesian bandits. In a multi-armed bandit, we obtain $O(c_\Delta \log n)$ and $O(c_h \log^2 n)$ upper bounds for an upper confidence bound algorithm, where $c_h$ and $c_\Delta$ are constants depending on the prior distribution and the gaps of bandit instances sampled from it, respectively. The latter bound asymptotically matches the lower bound of Lai (1987). Our proofs are a major technical departure from prior works, while being simple and general. To show the generality of our techniques, we apply them to linear bandits. Our results provide insights on the value of prior in the Bayesian setting, both in the objective and as a side information given to the learner. They significantly improve upon existing $\tilde{O}(\sqrt{n})$ bounds, which have become standard in the literature despite the logarithmic lower bound of Lai (1987).

bandit, data mining, machine learning, (22 more...)

Neural Information Processing Systems

Country: North America > United States > Texas (0.14)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Data Science > Data Mining > Big Data (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Finite-Time Logarithmic Bayes Regret Upper Bounds

Atsidakou, Alexia; Kveton, Branislav; Katariya, Sumeet; Caramanis, Constantine; Sanghavi, Sujay

arXiv.org Machine Learning | Nov-3-2023

We derive the first finite-time logarithmic Bayes regret upper bounds for Bayesian bandits. In Gaussian bandits, we obtain $O(c_\Delta \log n)$ and $O(c_h \log^2 n)$ bounds for an upper confidence bound algorithm, where $c_h$ and $c_\Delta$ are constants depending on the prior distribution and the gaps of random bandit instances sampled from it, respectively. The latter bound asymptotically matches the lower bound of Lai (1987). Our proofs are a major technical departure from prior works, while being simple and general. To show the generality of our techniques, we apply them to linear bandits. Our results provide insights on the value of prior in the Bayesian setting, both in the objective and as a side information given to the learner. They significantly improve upon existing $\tilde{O}(\sqrt{n})$ bounds, which have become standard in the literature despite the existing lower bounds.
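
The abstract does not spell out the upper confidence bound algorithm, so the following is only a minimal sketch of one common Bayesian UCB form for a Gaussian bandit. It assumes a known reward noise variance, an independent Gaussian prior per arm, and a confidence level of delta = 1/n; the function name, parameters, and the exact width of the confidence interval are illustrative assumptions, not the authors' definitions. It also reports the regret on a single fixed instance, whereas the Bayes regret in the abstract averages over instances drawn from the prior.

```python
import math
import random


def bayes_ucb_gaussian(true_means, n_rounds, prior_mean=0.0, prior_var=1.0,
                       noise_var=1.0, delta=None, seed=0):
    """Run a Bayesian UCB sketch on one Gaussian bandit instance.

    Returns (pulls, regret), where regret sums the gaps of the pulled arms.
    All parameter names and defaults are hypothetical.
    """
    rng = random.Random(seed)
    k = len(true_means)
    # Assumed confidence level delta = 1/n (not taken from the paper).
    delta = 1.0 / n_rounds if delta is None else delta
    pulls = [0] * k
    reward_sums = [0.0] * k
    best = max(true_means)
    regret = 0.0
    for _ in range(n_rounds):
        indices = []
        for a in range(k):
            # Conjugate Gaussian posterior update with known noise variance.
            post_var = 1.0 / (1.0 / prior_var + pulls[a] / noise_var)
            post_mean = post_var * (prior_mean / prior_var
                                    + reward_sums[a] / noise_var)
            # Index: posterior mean plus a scaled posterior standard deviation.
            width = math.sqrt(2.0 * post_var * math.log(1.0 / delta))
            indices.append(post_mean + width)
        arm = max(range(k), key=lambda a: indices[a])
        reward = rng.gauss(true_means[arm], math.sqrt(noise_var))
        pulls[arm] += 1
        reward_sums[arm] += reward
        regret += best - true_means[arm]
    return pulls, regret
```

Calling `bayes_ucb_gaussian([0.0, 3.0], n_rounds=1000)` returns the per-arm pull counts and the accumulated regret; with a gap this large, nearly all pulls should go to the second arm.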

bandit, data mining, machine learning, (21 more...)

arXiv.org Machine Learning

2306.09136

Country:

North America > United States > Texas > Travis County > Austin (0.04)
North America > United States > New York > New York County > New York City (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Data Science > Data Mining > Big Data (0.47)

Cost-Efficient Online Decision Making: A Combinatorial Multi-Armed Bandit Approach

Rahbar, Arman; Åkerblom, Niklas; Chehreghani, Morteza Haghir

arXiv.org Artificial Intelligence | Aug-21-2023

Online decision making plays a crucial role in numerous real-world applications. In many scenarios, the decision is made based on performing a sequence of tests on the incoming data points. However, performing all tests can be expensive and is not always possible. In this paper, we provide a novel formulation of the online decision making problem based on combinatorial multi-armed bandits and take the cost of performing tests into account. Based on this formulation, we provide a new framework for cost-efficient online decision making which can utilize posterior sampling or BayesUCB for exploration. We provide a rigorous theoretical analysis for our framework and present various experimental results that demonstrate its applicability to real-world problems.

artificial intelligence, data mining, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2308.10699

Country:

Europe > Sweden > Västra Götaland > Gothenburg (0.05)
North America > United States > Wisconsin (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Genre: Research Report (1.00)

Industry:

Health & Medicine (0.97)
Automobiles & Trucks (0.94)
Transportation > Ground > Road (0.68)
(2 more...)

Technology:

Information Technology > Data Science > Data Mining > Big Data (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

A Contextual Combinatorial Semi-Bandit Approach to Network Bottleneck Identification

Hoseini, Fazeleh; Åkerblom, Niklas; Chehreghani, Morteza Haghir

arXiv.org Artificial Intelligence | Mar-5-2023

Bottleneck identification is an essential task in network analysis with numerous important applications, such as traffic planning and road network management. For example, in a road network, the road segment with the highest cost is described as a path-specific bottleneck on a path between a source node and a destination node. The cost or weight can be defined according to specific criteria, such as travel time, energy consumption, etc. The aim is to find a path which minimizes the bottleneck among all paths connecting the source and destination nodes. Bottleneck identification can thus be characterized, in a given road network graph, as finding a path with the smallest maximum edge weight among the paths connecting the source node and the destination node, i.e., finding the minimax edge. By negating the edge weights, bottleneck identification can also be viewed as the widest path problem or the maximum capacity path problem [20].
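
The minimax-path formulation described above can be illustrated for the offline case, where all edge weights are known. The sketch below is a standard Dijkstra-style search in which a path's cost is its largest edge weight rather than the sum of its weights. It is not the paper's bandit method, which has to learn the weights online; the adjacency-list format and the function name are assumptions made for illustration.

```python
import heapq


def minimax_path(graph, source, target):
    """Find a path from source to target with the smallest maximum edge weight.

    graph: dict mapping node -> list of (neighbor, weight) pairs (assumed format).
    Returns (bottleneck, path), or (None, []) if target is unreachable.
    For source == target the bottleneck is -inf, since the path has no edges.
    """
    best = {source: float("-inf")}
    parent = {source: None}
    heap = [(float("-inf"), source)]
    while heap:
        bottleneck, node = heapq.heappop(heap)
        if bottleneck > best.get(node, float("inf")):
            continue  # stale heap entry, a better value was already found
        if node == target:
            break
        for neighbor, weight in graph.get(node, []):
            # The cost of extending a path is the larger of its current
            # bottleneck and the new edge weight.
            candidate = max(bottleneck, weight)
            if candidate < best.get(neighbor, float("inf")):
                best[neighbor] = candidate
                parent[neighbor] = node
                heapq.heappush(heap, (candidate, neighbor))
    if target not in best:
        return None, []
    path = []
    node = target
    while node is not None:
        path.append(node)
        node = parent[node]
    return best[target], path[::-1]


# Toy directed road network: A-B-D has bottleneck 4, A-C-D has bottleneck 3.
roads = {
    "A": [("B", 4), ("C", 3)],
    "B": [("D", 2)],
    "C": [("D", 3)],
}
print(minimax_path(roads, "A", "D"))  # (3, ['A', 'C', 'D'])
```

The route through C wins even though its edges sum to more than those of the route through B, because only the largest edge on each path matters.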

artificial intelligence, data mining, machine learning, (20 more...)

arXiv.org Artificial Intelligence

2206.08144

Country:

Europe > Sweden > Västra Götaland > Gothenburg (0.05)
Europe > Luxembourg > Luxembourg Canton > Luxembourg City (0.04)

Genre: Research Report (0.40)

Industry:

Transportation > Infrastructure & Services (1.00)
Transportation > Ground > Road (1.00)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)
Information Technology > Data Science > Data Mining > Big Data (0.47)

Strategies for Safe Multi-Armed Bandits with Logarithmic Regret and Risk

Chen, Tianrui; Gangrade, Aditya; Saligrama, Venkatesh

arXiv.org Machine Learning | Apr-1-2022

We investigate a natural but surprisingly unstudied approach to the multi-armed bandit problem under safety risk constraints. Each arm is associated with an unknown law on safety risks and rewards, and the learner's goal is to maximise reward whilst not playing unsafe arms, as determined by a given threshold on the mean risk. We formulate a pseudo-regret for this setting that enforces this safety constraint in a per-round way by softly penalising any violation, regardless of the gain in reward due to the same. This has practical relevance to scenarios such as clinical trials, where one must maintain safety for each round rather than in an aggregated sense. We describe doubly optimistic strategies for this scenario, which maintain optimistic indices for both safety risk and reward. We show that schema based on both frequentist and Bayesian indices satisfy tight gap-dependent logarithmic regret bounds, and further that these play unsafe arms only logarithmically many times in total. This theoretical analysis is complemented by simulation studies demonstrating the effectiveness of the proposed schema, and probing the domains in which their use is appropriate.
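
One way to read the doubly optimistic idea is as a two-step selection rule: keep every arm whose optimistic (lower) risk estimate is still within the threshold, then play the kept arm with the highest optimistic (upper) reward estimate. The sketch below follows that reading with a frequentist index. The Hoeffding-style confidence radius, the initial round-robin over unplayed arms, the fallback rule, and all names are assumptions for illustration and may differ from the indices analysed in the paper.

```python
import math


def doubly_optimistic_arm(reward_means, risk_means, counts, t, risk_threshold):
    """Pick an arm using optimistic indices for both reward and safety risk.

    reward_means, risk_means: empirical means per arm; counts: pulls per arm;
    t: current round. All names here are hypothetical.
    """
    k = len(counts)
    # Assumption: play each arm once before trusting any index.
    for a in range(k):
        if counts[a] == 0:
            return a
    candidates = []
    for a in range(k):
        # Assumed Hoeffding-style confidence radius.
        radius = math.sqrt(2.0 * math.log(max(t, 2)) / counts[a])
        risk_lcb = risk_means[a] - radius      # optimistic (low) view of risk
        reward_ucb = reward_means[a] + radius  # optimistic (high) view of reward
        if risk_lcb <= risk_threshold:
            candidates.append((reward_ucb, a))
    if not candidates:
        # Assumed fallback: no arm looks plausibly safe, so play the
        # arm with the lowest empirical risk.
        return min(range(k), key=lambda a: risk_means[a])
    return max(candidates)[1]
```

With many samples, an arm whose risk lower bound exceeds the threshold is filtered out even if its reward is higher, which matches the per-round safety goal the abstract describes.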

artificial intelligence, data mining, machine learning, (18 more...)

arXiv.org Machine Learning

2204.00706

Country: