Improving Safety in Reinforcement Learning Using Model-Based Architectures and Human Intervention

arXiv.org Artificial Intelligence

Recent progress in AI and reinforcement learning has shown great success in solving complex problems with high-dimensional state spaces. However, most of these successes have come in simulated environments where failure carries little or no consequence. Most real-world applications require trained solutions that are safe to operate, since catastrophic failures are inadmissible, especially when human interaction is involved. Current safe-RL systems rely on human oversight during training and exploration to ensure the agent does not enter a catastrophic state; such methods demand a large amount of human labor and are difficult to scale. We present a hybrid method for reducing human intervention time by combining model-based approaches with a supervised learner, improving sample efficiency while also ensuring safety. We evaluate these methods on various grid-world environments, using both standard and visual representations, and show that our approach outperforms traditional model-free approaches in sample efficiency, number of catastrophic states reached, and overall task performance.
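
To make the hybrid concrete, here is a minimal Python sketch of the blocking pattern the abstract describes: a supervised learner trained on human intervention labels stands in for the overseer and vetoes actions predicted to be catastrophic. All names here (SafetyBlocker, safe_step, the dictionary-backed classifier) are illustrative assumptions, not the paper's implementation.

    import random

    class SafetyBlocker:
        """Stand-in for the supervised learner trained on human oversight.
        Here it memorizes the (state, action) pairs a human blocked; a real
        system would generalize with a trained classifier."""
        def __init__(self):
            self.labels = {}                      # (state, action) -> 0/1

        def record(self, state, action, blocked):
            self.labels[(state, action)] = int(blocked)

        def is_catastrophic(self, state, action):
            return self.labels.get((state, action), 0) == 1

    def safe_step(env, agent, blocker, state):
        # Execute the agent's chosen action only if the blocker allows it;
        # otherwise substitute a random action the blocker considers safe.
        action = agent.act(state)
        if blocker.is_catastrophic(state, action):
            safe = [a for a in env.actions
                    if not blocker.is_catastrophic(state, a)]
            if safe:
                action = random.choice(safe)
        return env.step(action)

Once the blocker agrees with the human overseer often enough, it can replace the human in the loop, which is where the reduction in intervention time comes from.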


Responsibility and Blame: A Structural-Model Approach

Journal of Artificial Intelligence Research

Causality is typically treated as an all-or-nothing concept: either A is a cause of B or it is not. We extend the definition of causality introduced by Halpern and Pearl [2004a] to take into account the degree of responsibility of A for B. For example, if someone wins an election 11-0, then each person who votes for him is less responsible for the victory than if he had won 6-5. We then define a notion of degree of blame, which takes into account an agent's epistemic state. Roughly speaking, the degree of blame of A for B is the expected degree of responsibility of A for B, taken over the epistemic state of an agent.
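
The voting example admits a short worked computation. Under the Chockler-Halpern definition, a voter's degree of responsibility is 1/(k+1), where k is the minimal number of other votes that must change before that voter's own vote becomes critical. A Python sketch (assuming an odd electorate, so ties cannot occur):

    def voter_responsibility(votes_for, votes_against):
        # Degree of responsibility of one 'for' voter for the victory:
        # 1/(k+1), where k is the minimal number of other votes that must
        # flip before this voter's own vote becomes critical.
        margin = votes_for - votes_against
        if margin <= 0:
            return 0.0                    # the candidate did not win
        # Flipping one other 'for' vote shrinks the margin by 2; the voter
        # is critical once the margin is exactly 1.
        k = (margin - 1) // 2
        return 1.0 / (k + 1)

    print(voter_responsibility(11, 0))    # 11-0 victory -> 1/6
    print(voter_responsibility(6, 5))     # 6-5 victory  -> 1.0

The degree of blame would then be the probability-weighted average of this quantity over the situations the agent considers possible.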


Agent-based Ecological Model Calibration - on the Edge of a New Approach

arXiv.org Artificial Intelligence

This paper presents a new approach to ecological model calibration: an agent-based software tool. The agent works in three stages: (1) it builds a matrix that synthesizes the inter-variable relationships; (2) it analyses the steady-state sensitivity of different variables to different parameters; and (3) it runs the model iteratively and measures the model's lack of fit, adequacy, and reliability. Stage 3 continues until some convergence criteria are attained. At each iteration, the agent knows, from stages 1 and 2, which parameters are most likely to produce the desired shift in the predicted results.
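
A minimal Python sketch of the stage-3 loop, assuming a mean-squared-error lack-of-fit measure and a finite-difference stand-in for the stage-1 and stage-2 analysis (none of these choices are specified by the abstract):

    import numpy as np

    def calibrate(model, params, observed, step=0.1, max_iter=200, tol=1e-6):
        # model(params) -> predictions aligned with `observed`;
        # params is a 1-D array of calibration parameters.
        params = params.astype(float).copy()
        prev_error = np.inf
        for _ in range(max_iter):
            error = np.mean((model(params) - observed) ** 2)  # lack of fit
            if error < tol or abs(prev_error - error) < tol:  # convergence
                break
            # Stand-in for stages 1-2: estimate each parameter's influence
            # by finite differences, then adjust the most influential one
            # in the error-reducing direction.
            grads = np.zeros_like(params)
            for i in range(len(params)):
                bumped = params.copy()
                bumped[i] += step
                grads[i] = (np.mean((model(bumped) - observed) ** 2)
                            - error) / step
            i = int(np.argmax(np.abs(grads)))
            params[i] -= step * np.sign(grads[i])
            prev_error = error
        return params

    # Usage: recover the rate of a toy logistic-growth curve.
    t = np.linspace(0, 10, 50)
    target = 1 / (1 + np.exp(-0.8 * (t - 5)))
    fitted = calibrate(lambda p: 1 / (1 + np.exp(-p[0] * (t - 5))),
                       np.array([0.2]), target)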


Mixed-Initiative Systems for Collaborative Problem Solving

AI Magazine

Mixed-initiative systems are a popular approach to building intelligent systems that can collaborate naturally and effectively with people. But true collaborative behavior requires an agent to possess a number of capabilities, including reasoning, communication, planning, execution, and learning. We describe an integrated approach to the design and implementation of a collaborative problem-solving assistant based on a formal theory of joint activity and a declarative representation of tasks. This approach builds on prior work, by us and by others, on mixed-initiative dialogue and planning systems. We've all had the bad experience of working with someone who had to be told everything they needed to do (or, worse, for whom we had to do it ourselves).
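
As a toy illustration of what a declarative task representation might look like (this encoding is hypothetical, not the paper's formalism), each step of a joint activity can name which party may take the initiative:

    from dataclasses import dataclass, field

    @dataclass
    class Task:
        # Minimal declarative task record: a goal, ordered sub-tasks, and
        # who may perform it ("user", "agent", or "either").
        goal: str
        steps: list = field(default_factory=list)
        performer: str = "either"

    plan = Task(
        goal="book-trip",
        steps=[
            Task("choose-dates", performer="user"),
            Task("search-flights", performer="agent"),
            Task("confirm-booking"),      # initiative can shift here
        ],
    )

A collaborative assistant can then reason over such records to decide when to act, when to ask, and when to wait.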


A Probabilistic Approach for Maintaining Trust Based on Evidence

Journal of Artificial Intelligence Research

Leading agent-based trust models address two important needs. First, they show how an agent may estimate the trustworthiness of another agent based on prior interactions. Second, they show how agents may share their knowledge in order to cooperatively assess the trustworthiness of others. However, in real-life settings, information relevant to trust is usually obtained piecemeal, not all at once. Unfortunately, the problem of maintaining trust has drawn little attention.
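
A minimal Python sketch of one common evidence-based formulation (beta-distributed evidence with a forgetting factor; the paper's actual probabilistic model may differ), showing how trust can be maintained as evidence arrives piecemeal:

    class TrustModel:
        # Trust tracked as discounted counts of positive and negative
        # interactions; newer evidence outweighs older evidence.
        def __init__(self, discount=0.95):
            self.pos = 0.0
            self.neg = 0.0
            self.discount = discount          # forgetting factor

        def observe(self, outcome_good):
            self.pos *= self.discount         # age existing evidence
            self.neg *= self.discount
            if outcome_good:
                self.pos += 1.0
            else:
                self.neg += 1.0

        def trustworthiness(self):
            # Mean of Beta(pos + 1, neg + 1).
            return (self.pos + 1.0) / (self.pos + self.neg + 2.0)

        def merge(self, other):
            # Cooperative assessment: pool another agent's evidence.
            self.pos += other.pos
            self.neg += other.neg

    t = TrustModel()
    for good in [True, True, False, True]:
        t.observe(good)
    print(t.trustworthiness())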