AITopics | bayesian reinforcement learning

Belief Aided Navigation using Bayesian Reinforcement Learning for Avoiding Humans in Blind Spots

Kim, Jinyeob, Kwak, Daewon, Rim, Hyunwoo, Kim, Donghan

arXiv.org Artificial Intelligence, Mar-15-2024

Recent research on mobile robot navigation has focused on socially aware navigation in crowded environments. However, existing methods do not adequately account for human-robot interactions and demand accurate location information from omnidirectional sensors, rendering them unsuitable for practical applications. In response to this need, this study introduces a novel algorithm, BNBRL+, predicated on the partially observable Markov decision process framework to assess risks in unobservable areas and formulate movement strategies under uncertainty. BNBRL+ consolidates belief algorithms with Bayesian neural networks to probabilistically infer beliefs based on the positional data of humans. It further integrates the dynamics between the robot, humans, and inferred beliefs to determine the navigation paths and embeds social norms within the reward function, thereby facilitating socially aware navigation. Through experiments in various risk-laden scenarios, this study validates the effectiveness of BNBRL+ in navigating crowded environments with blind spots. The model's ability to navigate effectively in spaces with limited visibility and avoid obstacles dynamically can significantly improve the safety and reliability of autonomous vehicles.

algorithm, navigation, robot, (12 more...)

arXiv.org Artificial Intelligence

2403.10105

Country:

North America > United States > Washington > King County > Seattle (0.04)
North America > United States > California (0.04)
Asia > South Korea (0.04)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.82)

On-Robot Bayesian Reinforcement Learning for POMDPs

Nguyen, Hai, Katt, Sammie, Xiao, Yuchen, Amato, Christopher

arXiv.org Artificial Intelligence, Jul-21-2023

Robot learning is often difficult due to the expense of gathering data. The need for large amounts of data can, and should, be tackled with effective algorithms and leveraging expert information on robot dynamics. Bayesian reinforcement learning (BRL), thanks to its sample efficiency and ability to exploit prior knowledge, is uniquely positioned as such a solution method. Unfortunately, the application of BRL has been limited due to the difficulties of representing expert knowledge as well as solving the subsequent inference problem. This paper advances BRL for robotics by proposing a specialized framework for physical systems. In particular, we capture this knowledge in a factored representation, then demonstrate the posterior factorizes in a similar shape, and ultimately formalize the model in a Bayesian framework. We then introduce a sample-based online solution method, based on Monte-Carlo tree search and particle filtering, specialized to solve the resulting model. This approach can, for example, utilize typical low-level robot simulators and handle uncertainty over unknown dynamics of the environment. We empirically demonstrate its efficiency by performing on-robot learning in two human-robot interaction tasks with uncertainty about human behavior, achieving near-optimal performance after only a handful of real-world episodes. A video of learned policies is at https://youtu.be/H9xp60ngOes.

artificial intelligence, bayesian inference, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2307.11954

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
Asia > Japan > Honshū > Kansai > Hyogo Prefecture > Kobe (0.04)

Genre: Research Report (0.64)

Industry:

Education (0.46)
Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
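
The abstract above mentions particle filtering as one ingredient of the online solution method. As a rough illustration only, here is a minimal bootstrap particle-filter belief update in Python. It is a generic sketch, not the authors' implementation: the corridor dynamics, the sensor model, and the names `particle_filter_update`, `transition`, and `obs_likelihood` are all hypothetical.

```python
import random

def particle_filter_update(particles, action, observation,
                           transition, obs_likelihood, rng):
    """One bootstrap particle-filter step: propagate, weight, resample."""
    # Propagate every particle through the (stochastic) transition model.
    proposed = [transition(s, action, rng) for s in particles]
    # Weight each proposed state by how well it explains the observation.
    weights = [obs_likelihood(observation, s) for s in proposed]
    if sum(weights) == 0:
        # No particle explains the observation; keep the unweighted proposals.
        return proposed
    # Resample with replacement in proportion to the weights.
    return rng.choices(proposed, weights=weights, k=len(particles))

def transition(state, action, rng):
    # Hypothetical toy dynamics on cells 0..4: move with probability 0.8.
    step = action if rng.random() < 0.8 else 0
    return min(4, max(0, state + step))

def obs_likelihood(observation, state):
    # Hypothetical sensor: reports the true cell with probability 0.7.
    return 0.7 if observation == state else 0.075

rng = random.Random(0)
belief = [rng.randrange(5) for _ in range(1000)]  # uniform initial belief
belief = particle_filter_update(belief, +1, 3, transition, obs_likelihood, rng)
print({cell: belief.count(cell) for cell in range(5)})
```

After one step with observation 3, most of the particle mass should sit on cell 3. In a planner such as Monte-Carlo tree search, a belief of this kind would typically be sampled to obtain root states for simulations.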

Bayesian Reinforcement Learning for Automatic Voltage Control under Cyber-Induced Uncertainty

Sahu, Abhijeet, Davis, Katherine

arXiv.org Artificial Intelligence, May-25-2023

Voltage control is crucial to the reliable operation of large-scale power systems, as timely reactive power support can help prevent widespread outages. However, power systems currently have no built-in mechanism to ensure that the voltage control objective of maintaining reliable operation survives the uncertainty caused by an adversary's presence. Hence, this work introduces a Bayesian Reinforcement Learning (BRL) approach for power system control problems, with a focus on sustained voltage control under uncertainty in a cyber-adversarial environment. This work proposes a data-driven BRL-based approach for automatic voltage control by formulating and solving a Partially-Observable Markov Decision Problem (POMDP), where the states are partially observable due to cyber intrusions. The techniques are evaluated on the WSCC and IEEE 14 bus systems. Additionally, BRL techniques assist in automatically finding a threshold for exploration and exploitation in various RL techniques.

artificial intelligence, machine learning, reinforcement learning, (15 more...)

arXiv.org Artificial Intelligence

2305.16469

Genre: Research Report (0.50)

Industry:

Information Technology > Security & Privacy (1.00)
Energy > Power Industry (1.00)
Energy > Oil & Gas > Upstream (0.34)

Cost-Sensitive Exploration in Bayesian Reinforcement Learning

Neural Information Processing Systems, Apr-6-2023, 12:23:31 GMT

In this paper, we consider Bayesian reinforcement learning (BRL) where actions incur costs in addition to rewards, and thus exploration has to be constrained in terms of the expected total cost while learning to maximize the expected long-term total reward. In order to formalize cost-sensitive exploration, we use the constrained Markov decision process (CMDP) as the model of the environment, in which we can naturally encode exploration requirements using the cost function. We extend BEETLE, a model-based BRL method, for learning in the environment with cost constraints. We demonstrate the cost-sensitive exploration behaviour in a number of simulated problems.

bayesian reinforcement learning, cost-sensitive exploration

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)
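
For readers unfamiliar with constrained MDPs, the cost-constrained objective described in the abstract above is commonly written as follows. This is the generic textbook form, not necessarily the paper's exact notation: the discount factor \gamma, the reward function R, the cost function C, and the cost budget \hat{c} are symbols assumed here for illustration.

```latex
\max_{\pi} \; \mathbb{E}_{\pi}\!\left[ \sum_{t=0}^{\infty} \gamma^{t} R(s_t, a_t) \right]
\quad \text{subject to} \quad
\mathbb{E}_{\pi}\!\left[ \sum_{t=0}^{\infty} \gamma^{t} C(s_t, a_t) \right] \le \hat{c}
```

Exploration requirements are then encoded through the choice of C and \hat{c}: any policy, exploratory or not, must keep its expected total cost within the budget.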

Surveillance Evasion Through Bayesian Reinforcement Learning

Qi, Dongping, Bindel, David, Vladimirsky, Alexander

arXiv.org Artificial Intelligence, Feb-23-2023

We consider a task of surveillance-evading path-planning in a continuous setting. An Evader strives to escape from a 2D domain while minimizing the risk of detection (and immediate capture). The probability of detection is path-dependent and determined by the spatially inhomogeneous surveillance intensity, which is fixed but a priori unknown and gradually learned in the multi-episodic setting. We introduce a Bayesian reinforcement learning algorithm that relies on a Gaussian Process regression (to model the surveillance intensity function based on the information from prior episodes), numerical methods for Hamilton-Jacobi PDEs (to plan the best continuous trajectories based on the current model), and Confidence Bounds (to balance the exploration vs exploitation). We use numerical experiments and regret metrics to highlight the significant advantages of our approach compared to traditional graph-based algorithms of reinforcement learning.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

arXiv.org Artificial Intelligence

2109.14811

Country:

South America > Brazil (0.04)
North America > United States > New York (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(2 more...)

Genre: Research Report (0.50)

Industry: Law (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.84)

ROI-Constrained Bidding via Curriculum-Guided Bayesian Reinforcement Learning

Wang, Haozhe, Du, Chao, Fang, Panyan, Yuan, Shuo, He, Xuming, Wang, Liang, Zheng, Bo

arXiv.org Artificial Intelligence, Jul-16-2022

Real-Time Bidding (RTB) is an important mechanism in modern online advertising systems. Advertisers employ bidding strategies in RTB to optimize their advertising effects subject to various financial requirements, especially the return-on-investment (ROI) constraint. ROIs change non-monotonically during the sequential bidding process, and often induce a see-saw effect between constraint satisfaction and objective optimization. While some existing approaches show promising results in static or mildly changing ad markets, they fail to generalize to highly dynamic ad markets with ROI constraints, due to their inability to adaptively balance constraints and objectives amidst non-stationarity and partial observability. In this work, we specialize in ROI-Constrained Bidding in non-stationary markets. Based on a Partially Observable Constrained Markov Decision Process, our method exploits an indicator-augmented reward function free of extra trade-off parameters and develops a Curriculum-Guided Bayesian Reinforcement Learning (CBRL) framework to adaptively control the constraint-objective trade-off in non-stationary ad markets. Extensive experiments on a large-scale industrial dataset with two problem settings reveal that CBRL generalizes well in both in-distribution and out-of-distribution data regimes, and enjoys superior learning efficiency and stability.

constraint, constraint-objective trade-off, reward function, (14 more...)

arXiv.org Artificial Intelligence

doi: 10.1145/3534678.3539211

2206.0524

Country:

North America > United States > District of Columbia > Washington (0.05)
North America > United States > New York > New York County > New York City (0.04)
Asia > China > Shanghai > Shanghai (0.04)

Genre: Research Report (0.82)

Industry:

Marketing (0.67)
Information Technology > Services (0.49)
Banking & Finance > Trading (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Cost-Sensitive Exploration in Bayesian Reinforcement Learning

Kim, Dongho, Kim, Kee-eung, Poupart, Pascal

Neural Information Processing Systems, Feb-15-2020, 00:27:00 GMT

bayesian reinforcement learning, cost-sensitive exploration

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

Bayesian Reinforcement Learning in Factored POMDPs

Katt, Sammie, Oliehoek, Frans, Amato, Christopher

arXiv.org Artificial Intelligence, Nov-13-2018

Bayesian approaches provide a principled solution to the exploration-exploitation trade-off in Reinforcement Learning. Typical approaches, however, either assume a fully observable environment or scale poorly. This work introduces the Factored Bayes-Adaptive POMDP model, a framework that is able to exploit the underlying structure while learning the dynamics in partially observable systems. We also present a belief tracking method to approximate the joint posterior over state and model variables, and an adaptation of the Monte-Carlo Tree Search solution method, which together are capable of solving the underlying problem near-optimally. Our method is able to learn efficiently given a known factorization or also learn the factorization and the model parameters at the same time. We demonstrate that this approach is able to outperform current methods and tackle problems that were previously infeasible.

bayesian inference, pomdp, upstream oil & gas, (21 more...)

arXiv.org Artificial Intelligence

1811.05612

Genre: Research Report (1.00)

Industry: Energy > Oil & Gas > Upstream (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Benchmarking for Bayesian Reinforcement Learning

Castronovo, Michael, Ernst, Damien, Couetoux, Adrien, Fonteneau, Raphael

arXiv.org Artificial Intelligence, Sep-14-2015

In the Bayesian Reinforcement Learning (BRL) setting, agents try to maximise the collected rewards while interacting with their environment, using prior knowledge that is available beforehand. Many BRL algorithms have already been proposed, but even though a few toy examples exist in the literature, there are still no extensive or rigorous benchmarks to compare them. This paper addresses this problem and provides a new BRL comparison methodology along with the corresponding open source library. In this methodology, a comparison criterion that measures the performance of algorithms on large sets of Markov Decision Processes (MDPs) drawn from some probability distributions is defined. In order to enable the comparison of non-anytime algorithms, our methodology also includes a detailed analysis of the computation time requirement of each algorithm. Our library is released with all source code and documentation: it includes three test problems, each of which has two different prior distributions, and seven state-of-the-art RL algorithms. Finally, our library is illustrated by comparing all the available algorithms and the results are discussed.

artificial intelligence, machine learning, reinforcement learning, (18 more...)

arXiv.org Artificial Intelligence

doi: 10.1371/journal.pone.0157088

1509.04064

Country: Europe > Belgium (0.14)

Genre: Research Report (0.81)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
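
To make the idea of "MDPs drawn from some probability distributions" concrete, the sketch below samples a transition model from a prior and then updates that prior from observed transitions. It assumes independent Dirichlet priors over next states for each (state, action) pair, which is a common modelling choice in BRL but is not stated in the abstract. All function names are hypothetical and unrelated to the authors' library.

```python
import random

def sample_dirichlet(alphas, rng):
    """Draw one probability vector from Dirichlet(alphas) via Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]

def sample_transition_model(prior_counts, rng):
    """Sample a transition model: one Dirichlet draw per (state, action)."""
    return {sa: sample_dirichlet(alphas, rng)
            for sa, alphas in prior_counts.items()}

def posterior_counts(prior_counts, transitions):
    """Conjugate update: add 1 to the count of each observed next state."""
    counts = {sa: list(alphas) for sa, alphas in prior_counts.items()}
    for state, action, next_state in transitions:
        counts[(state, action)][next_state] += 1.0
    return counts

def posterior_mean(counts):
    """Expected transition probabilities under the Dirichlet counts."""
    return {sa: [a / sum(alphas) for a in alphas]
            for sa, alphas in counts.items()}

rng = random.Random(0)
# Hypothetical problem: 2 states, 1 action, uniform Dirichlet(1, 1) priors.
prior = {(0, 0): [1.0, 1.0], (1, 0): [1.0, 1.0]}
test_mdp = sample_transition_model(prior, rng)   # one MDP drawn from the prior
observed = [(0, 0, 1), (0, 0, 1), (0, 0, 1)]     # (state, action, next_state)
posterior = posterior_counts(prior, observed)
print(posterior_mean(posterior)[(0, 0)])         # [0.2, 0.8]
```

A benchmark in this spirit would draw many such test MDPs, run each agent on them, and average the collected returns. The evaluation loop is omitted here.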

Reward Shaping for Model-Based Bayesian Reinforcement Learning

Kim, Hyeoneun (KAIST) | Lim, Woosang (KAIST) | Lee, Kanghoon (KAIST) | Noh, Yung-Kyun (KAIST) | Kim, Kee-Eung (KAIST)

AAAI Conferences, Mar-6-2015

Bayesian reinforcement learning (BRL) provides a formal framework for optimal exploration-exploitation tradeoff in reinforcement learning. Unfortunately, it is generally intractable to find the Bayes-optimal behavior except for restricted cases. As a consequence, many BRL algorithms, model-based approaches in particular, rely on approximated models or real-time search methods. In this paper, we present potential-based shaping for improving the learning performance in model-based BRL. We propose a number of potential functions that are particularly well suited for BRL, and are domain-independent in the sense that they do not require any prior knowledge about the actual environment. By incorporating the potential function into real-time heuristic search, we show that we can significantly improve the learning performance in standard benchmark domains.

artificial intelligence, potential function, upstream oil & gas, (17 more...)

AAAI Conferences

Twenty-Ninth AAAI Conference on Artificial Intelligence

Country: North America > United States > Massachusetts (0.14)

Industry: Energy > Oil & Gas > Upstream (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.85)
