Cost-Sensitive Exploration in Bayesian Reinforcement Learning

Mar-14-2024, 09:52:10 GMT – Neural Information Processing Systems

In this paper, we consider Bayesian reinforcement learning (BRL) where actions incur costs in addition to rewards, so exploration must be constrained in terms of the expected total cost while learning to maximize the expected long-term total reward. To formalize cost-sensitive exploration, we use the constrained Markov decision process (CMDP) as the model of the environment, in which exploration requirements can be encoded naturally through the cost function. We extend BEETLE, a model-based BRL method, to learning in environments with cost constraints. We demonstrate the cost-sensitive exploration behaviour on a number of simulated problems.
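
To make the CMDP objective concrete, the following is a minimal sketch, not the paper's method and not BEETLE: it evaluates a fixed policy's expected discounted total reward and total cost in a tiny, fully known CMDP and checks the cost against a budget. The two-state environment, the action names `safe` and `risky`, the discount factor, the budget, and the helper names `evaluate` and `is_feasible` are all illustrative assumptions.

```python
# Hypothetical two-state CMDP; every number here is illustrative, not from the paper.
GAMMA = 0.9  # assumed discount factor

# P[s][a] = list of (next_state, probability)
P = {
    0: {"safe": [(0, 1.0)], "risky": [(1, 1.0)]},
    1: {"safe": [(0, 1.0)], "risky": [(1, 1.0)]},
}
# R[s][a] = immediate reward, C[s][a] = immediate cost
R = {0: {"safe": 1.0, "risky": 2.0}, 1: {"safe": 1.0, "risky": 2.0}}
C = {0: {"safe": 0.0, "risky": 1.0}, 1: {"safe": 0.0, "risky": 1.0}}


def evaluate(policy, signal, start=0, iters=2000):
    """Expected discounted total of `signal` (R or C) from `start`.

    policy[s] maps each action to its selection probability in state s.
    Uses plain iterative policy evaluation on the known model.
    """
    v = {s: 0.0 for s in P}
    for _ in range(iters):
        v = {
            s: sum(
                pa * (signal[s][a] + GAMMA * sum(p * v[s2] for s2, p in P[s][a]))
                for a, pa in policy[s].items()
            )
            for s in P
        }
    return v[start]


def is_feasible(policy, budget, start=0):
    """True if the policy's expected total cost stays within the budget."""
    return evaluate(policy, C, start) <= budget + 1e-9


always_safe = {s: {"safe": 1.0} for s in P}
always_risky = {s: {"risky": 1.0} for s in P}
mixed = {s: {"safe": 0.5, "risky": 0.5} for s in P}

BUDGET = 5.0  # assumed bound on expected discounted total cost
for name, pi in [("safe", always_safe), ("risky", always_risky), ("mixed", mixed)]:
    print(name, round(evaluate(pi, R), 3), round(evaluate(pi, C), 3),
          is_feasible(pi, BUDGET))
```

In this toy setting the always-risky policy earns the most reward but violates the budget, while the randomized policy stays within the budget and still earns more than the always-safe one. In the BRL setting of the paper the model is not known in advance, so this sketch only illustrates the constrained objective, not the learning procedure.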

artificial intelligence, cost constraint, machine learning

Country:
- Europe > United Kingdom
  - England > Cambridgeshire > Cambridge (0.14)
- North America > United States
  - Massachusetts (0.14)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models
  - Directed Networks > Bayesian Learning (0.84)
  - Undirected Networks > Markov Models (0.71)