
Collaborating Authors: Faisal, Aldo


Bag of Policies for Distributional Deep Exploration

arXiv.org Artificial Intelligence

Efficient exploration in complex environments remains a major challenge for reinforcement learning (RL). Compared to previous Thompson sampling-inspired mechanisms that enable temporally extended exploration, i.e., deep exploration, we focus on deep exploration in distributional RL. We develop here a general purpose approach, Bag of Policies (BoP), that can be built on top of any return distribution estimator by maintaining a population of its copies. BoP consists of an ensemble of multiple heads that are updated independently. During training, each episode is controlled by only one of the heads and the collected state-action pairs are used to update all heads off-policy, leading to distinct learning signals for each head which diversify …

Distributional RL (DiRL) has rapidly established its place among reinforcement learning (RL) algorithms Bellemare et al. [2017] as a powerful improvement over non-distributional value-based counterparts Lyle et al. [2019]. In DiRL, the agent does not learn a single summary statistic of the return for each state-action pair, but instead learns the whole return distribution. The agent's behaviour is being evaluated for multiple possible consequences which in turn affect the policy update. While this does lead to more stable learning and better performance Lyle et al. [2019], it does not itself change the way actions are selected; as distributional extensions to value-based RL, in C51 Bellemare et al. [2017] and QR-DQN Dabney et al. [2018b] the agent still takes actions according to the mean of the estimated return distributions in each state-action pair.
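The training loop described above (one ensemble head acting per episode, all heads updated off-policy from the shared transitions) can be illustrated with a small tabular sketch. The toy chain MDP, quantile-based heads, and hyperparameters below are illustrative assumptions, not the authors' implementation:

```python
# Minimal, hedged sketch of the Bag-of-Policies idea: K independent
# return-distribution heads, one randomly chosen head acting per episode,
# and every head updated off-policy from the shared transitions.
import numpy as np

K, N_QUANTILES, N_STATES = 4, 8, 12
GAMMA, LR, EPISODES, MAX_STEPS = 0.99, 0.1, 200, 40
taus = (np.arange(N_QUANTILES) + 0.5) / N_QUANTILES   # quantile midpoints
rng = np.random.default_rng(0)

class ChainEnv:
    """Tiny chain MDP (actions: 0 = stay, 1 = step right; reward at the end)."""
    def reset(self):
        self.s = 0
        return self.s
    def step(self, a):
        self.s = min(self.s + a, N_STATES - 1)
        done = self.s == N_STATES - 1
        return self.s, (1.0 if done else 0.0), done

# theta[k, s, a, :] = estimated return quantiles of head k; small random init
# keeps the heads distinct so their greedy behaviours (and exploration) differ.
theta = rng.normal(0.0, 0.01, size=(K, N_STATES, 2, N_QUANTILES))

def act(head, s):
    """Greedy action under the mean of this head's return distribution."""
    return int(np.argmax(theta[head, s].mean(axis=-1)))

def quantile_update(head, s, a, r, s2, done):
    """One-step quantile-regression (pinball-loss) update for one head."""
    a2 = int(np.argmax(theta[head, s2].mean(axis=-1)))
    target = r + (0.0 if done else GAMMA) * theta[head, s2, a2]
    for j, tau in enumerate(taus):
        u = target - theta[head, s, a, j]
        theta[head, s, a, j] += LR * np.mean(np.where(u > 0, tau, tau - 1.0))

env = ChainEnv()
for episode in range(EPISODES):
    behaviour_head = rng.integers(K)     # a single head controls the whole episode
    s, done, t = env.reset(), False, 0
    while not done and t < MAX_STEPS:
        a = act(behaviour_head, s)
        s2, r, done = env.step(a)
        for k in range(K):               # all heads learn off-policy from the same data
            quantile_update(k, s, a, r, s2, done)
        s, t = s2, t + 1
```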


Action Grammars: A Cognitive Model for Learning Temporal Abstractions

arXiv.org Artificial Intelligence

Hierarchical Reinforcement Learning algorithms have successfully been applied to temporal credit assignment problems with sparse reward signals. However, state-of-the-art algorithms require manual specification of sub-task structures and a sample-inefficient exploration phase, and lack semantic interpretability. Human infants, on the other hand, efficiently detect hierarchical substructures induced by their surroundings. In this work we propose a cognitive-inspired Reinforcement Learning architecture which uses grammar induction to identify sub-goal policies. More specifically, by treating an on-policy trajectory as a sentence sampled from the policy-conditioned language of the environment, we identify hierarchical constituents with the help of unsupervised grammatical inference. The resulting set of temporal abstractions is called action grammars (Pastra & Aloimonos, 2012) and can be used to enable efficient imitation, transfer and online learning.
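The grammar-induction step (treating an action trajectory as a sentence and folding recurring constituents into temporal abstractions) can be approximated with a simple pair-compression routine. The paper relies on unsupervised grammatical inference, so the sketch below is only an illustrative stand-in with hypothetical symbol names:

```python
# Byte-pair-style induction over an action "sentence": repeatedly replace the
# most frequent adjacent pair of symbols with a new production (macro-action).
from collections import Counter

def induce_action_grammar(trajectory, n_rules=3):
    """Return (compressed trajectory, productions) from a list of primitive actions."""
    seq = list(trajectory)
    productions = {}
    for rule_id in range(n_rules):
        pairs = Counter(zip(seq, seq[1:]))
        if not pairs:
            break
        (a, b), count = pairs.most_common(1)[0]
        if count < 2:               # no repeated structure left to abstract
            break
        symbol = f"G{rule_id}"      # new nonterminal = candidate macro-action
        productions[symbol] = (a, b)
        out, i = [], 0
        while i < len(seq):
            if i + 1 < len(seq) and seq[i] == a and seq[i + 1] == b:
                out.append(symbol)
                i += 2
            else:
                out.append(seq[i])
                i += 1
        seq = out
    return seq, productions

# Example: a gridworld-style action string with repeated sub-sequences.
actions = ["up", "right", "up", "right", "pick", "up", "right", "up", "right", "drop"]
compressed, rules = induce_action_grammar(actions)
print(compressed)   # ['G1', 'pick', 'G1', 'drop']
print(rules)        # {'G0': ('up', 'right'), 'G1': ('G0', 'G0')}
```

The extracted productions (G0, G1) can then be offered to the agent as additional macro-actions, which is the role the action grammars play for imitation, transfer and online learning.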


RLOC: Neurobiologically Inspired Hierarchical Reinforcement Learning Algorithm for Continuous Control of Nonlinear Dynamical Systems

arXiv.org Machine Learning

Nonlinear optimal control problems are often solved with numerical methods that require knowledge of the system's dynamics, which may be difficult to infer, and that carry a large computational cost associated with iterative calculations. We present a novel neurobiologically inspired hierarchical learning framework, Reinforcement Learning Optimal Control, which operates on two levels of abstraction and utilises a reduced number of controllers to solve nonlinear systems with unknown dynamics in continuous state and action spaces. Our approach is inspired by research at two levels of abstraction: first, at the level of limb coordination, human behaviour is explained by linear optimal feedback control theory. Second, in cognitive tasks involving learning symbolic-level action selection, humans learn such problems using model-free and model-based reinforcement learning algorithms. We propose that combining these two levels of abstraction leads to a fast global solution of nonlinear control problems using a reduced number of controllers. Our framework learns the local task dynamics from naive experience and forms locally optimal infinite-horizon Linear Quadratic Regulators which produce continuous low-level control. A top-level reinforcement learner uses the controllers as actions and learns how to best combine them in state space while maximising a long-term reward. A single optimal control objective function drives high-level symbolic learning by providing training signals on the desirability of each selected controller. We show that a small number of locally optimal linear controllers are able to solve global nonlinear control problems with unknown dynamics when combined with a reinforcement learner in this hierarchical framework. Our algorithm competes in terms of computational cost and solution quality with sophisticated control algorithms, and we illustrate this with solutions to benchmark problems.
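A rough sketch of the two-level structure (infinite-horizon LQR controllers as low-level actions for a tabular reinforcement learner) might look as follows. The toy double-integrator dynamics, discretisation, and hyperparameters are assumptions for illustration, not the paper's setup:

```python
# Two-level sketch: a few local LQR controllers at the bottom, a tabular
# Q-learner on top that chooses which controller to run in each coarse state.
import numpy as np

def dlqr_gain(A, B, Q, R, iters=200):
    """Infinite-horizon discrete LQR gain via Riccati iteration."""
    P = Q.copy()
    for _ in range(iters):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
    return K

# A few local linearisations (here: identical toy double-integrator models;
# in RLOC these would be fit from naive experience around different regions).
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
controllers = [dlqr_gain(A, B, np.diag([1.0, 0.1]), np.array([[0.01]])) for _ in range(3)]
targets = [np.array([0.0, 0.0]), np.array([0.5, 0.0]), np.array([-0.5, 0.0])]

def discretise(x, bins=10):
    """Coarse high-level state for the tabular learner."""
    idx = np.clip(((x + 1.0) / 2.0 * bins).astype(int), 0, bins - 1)
    return idx[0] * bins + idx[1]

n_states, n_options = 100, len(controllers)
Q_table = np.zeros((n_states, n_options))
alpha, gamma, eps = 0.1, 0.95, 0.2
rng = np.random.default_rng(1)

for episode in range(200):
    x = rng.uniform(-1, 1, size=2)
    for t in range(30):
        s = discretise(x)
        o = rng.integers(n_options) if rng.random() < eps else int(np.argmax(Q_table[s]))
        K, x_star = controllers[o], targets[o]
        u = -K @ (x - x_star)                      # continuous low-level LQR control
        x = A @ x + (B @ u).ravel() + rng.normal(0, 0.01, 2)
        r = -float(x @ x)                          # single cost drives high-level learning
        s2 = discretise(x)
        Q_table[s, o] += alpha * (r + gamma * Q_table[s2].max() - Q_table[s, o])
```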


Improving Sepsis Treatment Strategies by Combining Deep and Kernel-Based Reinforcement Learning

arXiv.org Machine Learning

Sepsis is the leading cause of mortality in the ICU. It is challenging to manage because individual patients respond differently to treatment. Thus, tailoring treatment to the individual patient is essential for the best outcomes. In this paper, we take steps toward this goal by applying a mixture-of-experts framework to personalize sepsis treatment. The mixture model selectively alternates between neighbor-based (kernel) and deep reinforcement learning (DRL) experts depending on the patient's current history. On a large retrospective cohort, this mixture-based approach outperforms physician, kernel-only, and DRL-only experts.
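A minimal sketch of the gating idea (use the kernel expert when the current history is close to previously seen patients, otherwise defer to the DRL expert) is shown below. The distance-based switching rule, threshold, and stand-in experts are illustrative assumptions rather than the paper's exact mixture model:

```python
# Hedged sketch of mixture-of-experts gating between a k-NN (kernel) expert
# and a deep RL expert, based on how familiar the current patient history is.
import numpy as np

class KernelExpert:
    """k-nearest-neighbour policy over encoded patient histories."""
    def __init__(self, histories, actions, k=5):
        self.X, self.y, self.k = np.asarray(histories), np.asarray(actions), k
    def nearest_distance(self, h):
        return np.sort(np.linalg.norm(self.X - h, axis=1))[: self.k].mean()
    def act(self, h):
        idx = np.argsort(np.linalg.norm(self.X - h, axis=1))[: self.k]
        return int(np.bincount(self.y[idx]).argmax())   # majority vote of neighbours

def mixture_action(h, kernel_expert, drl_expert, dist_threshold=1.0):
    """Gate between experts based on distance to previously seen histories."""
    if kernel_expert.nearest_distance(h) < dist_threshold:
        return kernel_expert.act(h)                     # enough similar patients seen
    return drl_expert(h)                                # otherwise generalise with DRL

# Toy usage with random encoded histories and a stand-in DRL policy.
rng = np.random.default_rng(0)
kernel = KernelExpert(rng.normal(size=(500, 8)), rng.integers(0, 25, size=500))
drl_policy = lambda h: int(np.argmax(h[:5]))            # placeholder for a trained Q-network
print(mixture_action(rng.normal(size=8), kernel, drl_policy))
```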


Behaviour Policy Estimation in Off-Policy Policy Evaluation: Calibration Matters

arXiv.org Machine Learning

In this work, we consider the problem of estimating a behaviour policy for use in Off-Policy Policy Evaluation (OPE) when the true behaviour policy is unknown. Via a series of empirical studies, we demonstrate that accurate OPE is strongly dependent on the calibration of estimated behaviour policy models: how precisely the behaviour policy is estimated from data. We show how powerful parametric models such as neural networks can result in highly uncalibrated behaviour policy models on a real-world medical dataset, and illustrate how a simple, non-parametric, k-nearest neighbours model produces better calibrated behaviour policy estimates and can be used to obtain superior importance sampling-based OPE estimates.
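The pipeline described above (estimate the behaviour policy from logged data, then plug the estimated probabilities into importance-sampling OPE) can be sketched as follows. The k-NN probability model with light smoothing and the toy data are illustrative assumptions, but they show why calibration of the denominator matters: every importance weight divides by the estimated behaviour probability, so systematic over- or under-confidence biases the estimate.

```python
# Hedged sketch: k-NN behaviour policy estimate feeding an importance-sampling
# OPE estimator. Data shapes and the evaluation policy are toy assumptions.
import numpy as np

def knn_behaviour_probs(states, actions, query, n_actions, k=20):
    """Estimated probability of each action at `query`, from the k nearest logged states."""
    idx = np.argsort(np.linalg.norm(states - query, axis=1))[:k]
    counts = np.bincount(actions[idx], minlength=n_actions).astype(float)
    return (counts + 1e-3) / (counts.sum() + 1e-3 * n_actions)   # light smoothing

def is_estimate(trajectories, eval_policy, states, actions, n_actions):
    """Per-trajectory importance sampling with the estimated behaviour policy."""
    values = []
    for traj in trajectories:                  # traj: list of (state, action, reward)
        w, ret = 1.0, 0.0
        for s, a, r in traj:
            pi_b = knn_behaviour_probs(states, actions, s, n_actions)[a]
            w *= eval_policy(s)[a] / pi_b      # weight depends on calibration of pi_b
            ret += r
        values.append(w * ret)
    return float(np.mean(values))

# Toy usage with random logged data and a uniform evaluation policy.
rng = np.random.default_rng(0)
S = rng.normal(size=(200, 4)); A = rng.integers(0, 3, size=200)
trajs = [[(S[i], A[i], 1.0)] for i in range(0, 200, 20)]
print(is_estimate(trajs, lambda s: np.full(3, 1 / 3), S, A, 3))
```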


Evaluating Reinforcement Learning Algorithms in Observational Health Settings

arXiv.org Machine Learning

Much attention has been devoted recently to the development of machine learning algorithms with the goal of improving treatment policies in healthcare. Reinforcement learning (RL) is a sub-field within machine learning that is concerned with learning how to make sequences of decisions so as to optimize long-term effects. Already, RL algorithms have been proposed to identify decision-making strategies for mechanical ventilation, sepsis management and treatment of schizophrenia. However, before implementing treatment policies learned by black-box algorithms in high-stakes clinical decision problems, special care must be taken in the evaluation of these policies. In this document, our goal is to expose some of the subtleties associated with evaluating RL algorithms in healthcare. We aim to provide a conceptual starting point for clinical and computational researchers to ask the right questions when designing and evaluating algorithms for new ways of treating patients. In the following, we describe how choices about how to summarize a history, variance of statistical estimators, and confounders in more ad-hoc measures can result in unreliable, even misleading estimates of the quality of a treatment policy. We also provide suggestions for mitigating these effects---for while there is much promise for mining observational health data to uncover better treatment policies, evaluation must be performed thoughtfully.
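One of the pitfalls mentioned, the variance of statistical estimators, can be illustrated with a toy importance-sampling experiment. This is not from the paper; the policies and horizons below are arbitrary assumptions, chosen only to show how long treatment histories inflate the spread of OPE estimates:

```python
# Toy demo: cumulative importance-sampling weights become extremely variable
# as the horizon grows, which is one reason OPE on long ICU histories is hard.
import numpy as np

rng = np.random.default_rng(0)
n_trajectories, n_actions = 1000, 4
pi_b = np.full(n_actions, 1.0 / n_actions)      # uniform behaviour policy
pi_e = np.array([0.7, 0.1, 0.1, 0.1])           # evaluation policy prefers action 0

for horizon in (5, 10, 20, 40):
    actions = rng.integers(0, n_actions, size=(n_trajectories, horizon))
    weights = np.prod(pi_e[actions] / pi_b[actions], axis=1)   # cumulative IS weights
    print(horizon, weights.std())               # spread blows up as horizon grows
```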


Representation Balancing MDPs for Off-Policy Policy Evaluation

arXiv.org Artificial Intelligence

We study the problem of off-policy policy evaluation (OPPE) in RL. In contrast to prior work, we consider how to estimate both the individual policy value and the average policy value accurately. We draw inspiration from recent work in causal reasoning, and propose a new finite-sample generalization error bound for value estimates from MDP models. Using this upper bound as an objective, we develop a learning algorithm for an MDP model with a balanced representation, and show that our approach can yield substantially lower MSE in a common synthetic domain and on a challenging real-world sepsis management problem.
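One plausible reading of such an objective (fit an MDP model on a learned representation while penalising the mismatch between the behaviour-policy data distribution and a re-weighting toward the evaluation policy) is sketched below. The network sizes, the mean-embedding style penalty, and the per-sample weights are illustrative assumptions, not the paper's exact bound or algorithm:

```python
# Hedged sketch of a representation-balanced MDP model: transition/reward
# heads on a shared encoder, plus a penalty on the gap between representation
# means under the behaviour data and a re-weighting toward the evaluation policy.
import torch
import torch.nn as nn

class BalancedModel(nn.Module):
    def __init__(self, state_dim, n_actions, hidden=64):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(state_dim, hidden), nn.ReLU())
        self.next_state = nn.Linear(hidden + n_actions, state_dim)   # transition head
        self.reward = nn.Linear(hidden + n_actions, 1)               # reward head

    def forward(self, s, a_onehot):
        z = self.encoder(s)
        za = torch.cat([z, a_onehot], dim=-1)
        return self.next_state(za), self.reward(za), z

def balancing_penalty(z, weights):
    """Squared distance between mean representations under the two weightings."""
    w = weights / weights.sum()
    mean_behaviour = z.mean(dim=0)                   # uniform = behaviour distribution
    mean_eval = (w.unsqueeze(1) * z).sum(dim=0)      # re-weighted toward evaluation policy
    return ((mean_behaviour - mean_eval) ** 2).sum()

def loss(model, s, a_onehot, r, s_next, is_weights, lam=1.0):
    pred_next, pred_r, z = model(s, a_onehot)
    model_loss = ((pred_next - s_next) ** 2).mean() + ((pred_r.squeeze(-1) - r) ** 2).mean()
    return model_loss + lam * balancing_penalty(z, is_weights)

# Toy forward pass with illustrative shapes.
model = BalancedModel(state_dim=10, n_actions=4)
s = torch.randn(32, 10); a = torch.eye(4)[torch.randint(0, 4, (32,))]
out = loss(model, s, a, torch.randn(32), torch.randn(32, 10), torch.rand(32))
```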