
Collaborating Authors

 Brys, Tim


A Conceptual Framework for Externally-influenced Agents: An Assisted Reinforcement Learning Review

arXiv.org Artificial Intelligence

A long-term goal of reinforcement learning agents is to be able to perform tasks in complex real-world scenarios. The use of external information is one way of scaling agents to more complex problems. However, there is a general lack of collaboration or interoperability between different approaches using external information. In this work, we propose a conceptual framework and taxonomy for assisted reinforcement learning, aimed at fostering such collaboration by classifying and comparing various methods that use external information in the learning process. The proposed taxonomy details the relationship between the external information source and the learner agent, highlighting the process of information decomposition, structure, retention, and how it can be used to influence agent learning. As well as reviewing state-of-the-art methods, we identify current streams of reinforcement learning that use external information in order to improve the agent's performance and its decision-making process. These include heuristic reinforcement learning, interactive reinforcement learning, learning from demonstration, transfer learning, and learning from multiple sources, among others. These streams of reinforcement learning operate with the shared objective of scaffolding the learner agent. Lastly, we discuss further possibilities for future work in the field of assisted reinforcement learning systems.
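
The pipeline the taxonomy describes, an external source whose information is decomposed, structured, retained, and then used to influence the learner, can be illustrated with a minimal sketch. The `AdviceSource` interface, the environment API (`reset`, `step`, `actions`), and the rule that advice simply overrides exploration are assumptions made for illustration; the paper proposes a conceptual framework, not a concrete API.

```python
# Minimal sketch of an externally-influenced learning loop.
# `AdviceSource`, the environment API, and the override rule are illustrative
# assumptions; the paper defines a taxonomy, not a concrete API.
import random

class AdviceSource:
    """External information source (teacher, demonstrations, heuristic, ...)."""
    def advise(self, state):
        # Return a suggested action for `state`, or None if no advice.
        return None

def q_learning_with_advice(env, source, episodes=100,
                           alpha=0.1, gamma=0.99, epsilon=0.1):
    Q = {}  # (state, action) -> value
    for _ in range(episodes):
        state, done = env.reset(), False
        while not done:
            advice = source.advise(state)
            if advice is not None:            # external influence on action selection
                action = advice
            elif random.random() < epsilon:   # ordinary exploration otherwise
                action = random.choice(env.actions)
            else:
                action = max(env.actions, key=lambda a: Q.get((state, a), 0.0))
            next_state, reward, done = env.step(action)
            best_next = max(Q.get((next_state, a), 0.0) for a in env.actions)
            td_target = reward + gamma * best_next * (not done)
            Q[(state, action)] = Q.get((state, action), 0.0) + alpha * (
                td_target - Q.get((state, action), 0.0))
            state = next_state
    return Q
```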


Directed Policy Gradient for Safe Reinforcement Learning with Human Advice

arXiv.org Machine Learning

Many currently deployed Reinforcement Learning agents work in an environment shared with humans, be they co-workers, users, or clients. It is desirable that these agents adjust to people's preferences, learn faster thanks to their help, and act safely around them. We argue that most current approaches that learn from human feedback are unsafe: rewarding or punishing the agent a posteriori cannot immediately prevent it from wrongdoing. In this paper, we extend Policy Gradient to make it robust to external directives that would otherwise break the fundamentally on-policy nature of Policy Gradient. Our technique, Directed Policy Gradient (DPG), allows a teacher or backup policy to override the agent before it acts undesirably, while allowing the agent to leverage human advice or directives to learn faster. Our experiments demonstrate that DPG makes the agent learn much faster than reward-based approaches, while requiring an order of magnitude less advice.
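
One way to read the core mechanism, letting advice or a backup policy steer the executed action while keeping the update consistent with what was actually executed, is sketched below. The mixing scheme, the `advice_fn` callback, the environment API, and the plain REINFORCE update are assumptions for illustration; this is not claimed to be the exact DPG update.

```python
# REINFORCE-style sketch of learning with action overrides. Mixing the advice
# distribution into the executed distribution and taking the gradient under
# that mixture is one plausible reading of the idea, not the paper's update.
import torch

def episode_update(policy_net, optimizer, env, advice_fn, gamma=0.99, mix=0.5):
    log_probs, rewards = [], []
    state, done = env.reset(), False
    while not done:
        logits = policy_net(torch.as_tensor(state, dtype=torch.float32))
        pi = torch.softmax(logits, dim=-1)
        advice = advice_fn(state)              # distribution over actions, or None
        if advice is not None:
            advice = torch.as_tensor(advice, dtype=torch.float32)
            executed = (1 - mix) * pi + mix * advice   # advice steers / overrides
            executed = executed / executed.sum()
        else:
            executed = pi
        action = torch.multinomial(executed, 1).item()
        log_probs.append(torch.log(executed[action] + 1e-8))
        state, reward, done = env.step(action)
        rewards.append(reward)
    returns, g = [], 0.0
    for r in reversed(rewards):                # discounted returns, back to front
        g = r + gamma * g
        returns.insert(0, g)
    loss = -(torch.stack(log_probs) * torch.as_tensor(returns)).sum()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```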


Adapting to Concept Drift in Credit Card Transaction Data Streams Using Contextual Bandits and Decision Trees

AAAI Conferences

Credit card transactions predicted to be fraudulent by automated detection systems are typically handed over to human experts for verification. To limit costs, it is standard practice to select only the most suspicious transactions for investigation. We claim that a trade-off between exploration and exploitation is imperative to enable adaptation to changes in behavior (concept drift). Exploration consists of the selection and investigation of transactions with the purpose of improving predictive models, and exploitation consists of investigating transactions detected to be suspicious. Modeling the detection of a fraudulent transaction as a reward, we use an incremental Regression Tree learner to create clusters of transactions with similar expected rewards. This enables the use of a Contextual Multi-Armed Bandit (CMAB) algorithm to provide the exploration/exploitation trade-off. We introduce a novel variant of a CMAB algorithm that makes use of the structure of this tree, and use Semi-Supervised Learning to grow the tree using unlabeled data. The approach is evaluated on a real dataset and on data generated by a simulator that adds concept drift by adapting the behavior of fraudsters to avoid detection. It outperforms frequently used offline models in terms of cumulative rewards, in particular in the presence of concept drift.
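
A minimal sketch of the bandit side of this idea: the leaves of a regression tree act as arms, and a UCB-style score trades off a leaf's estimated reward (exploitation) against an exploration bonus when choosing which transactions to investigate. The `leaf_fn` mapping, the scoring formula, and the class layout are illustrative assumptions; the incremental tree learner, the semi-supervised growth, and the paper's specific CMAB variant are not reproduced here.

```python
# Sketch: leaves of a regression tree as contextual-bandit arms, with a
# UCB-style bonus deciding which transactions to investigate.
import math
from collections import defaultdict

class LeafBandit:
    def __init__(self, leaf_fn, c=1.0):
        self.leaf_fn = leaf_fn                 # maps a transaction to a leaf id
        self.c = c
        self.counts = defaultdict(int)
        self.reward_sums = defaultdict(float)
        self.total = 0

    def score(self, transaction):
        leaf = self.leaf_fn(transaction)
        n = self.counts[leaf]
        if n == 0:
            return float("inf")                # explore unseen leaves first
        mean = self.reward_sums[leaf] / n      # exploitation term
        bonus = self.c * math.sqrt(math.log(self.total + 1) / n)  # exploration term
        return mean + bonus

    def select(self, transactions, budget):
        """Pick the `budget` transactions with the highest UCB score."""
        return sorted(transactions, key=self.score, reverse=True)[:budget]

    def update(self, transaction, reward):     # reward = 1 if fraud confirmed
        leaf = self.leaf_fn(transaction)
        self.counts[leaf] += 1
        self.reward_sums[leaf] += reward
        self.total += 1
```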


Dimensionality Reduced Reinforcement Learning for Assistive Robots

AAAI Conferences

State-of-the-art personal robots need to perform complex manipulation tasks to be viable in assistive scenarios. However, many of these robots, like the PR2, use manipulators with many degrees of freedom, and the problem is made worse in bimanual manipulation tasks. The complexity of these robots leads to high-dimensional state spaces, which are difficult to learn in. We reduce the state space by using demonstrations to discover a representative low-dimensional hyperplane in which to learn. This allows the agent to converge quickly to a good policy. We call this Dimensionality Reduced Reinforcement Learning (DRRL). However, when performing dimensionality reduction, not all dimensions can be fully represented. We extend this work by first learning in a single dimension, and then transferring that knowledge to a higher-dimensional hyperplane. By using our Iterative DRRL (IDRRL) framework with an existing learning algorithm, the agent converges quickly to a better policy by iterating to increasingly higher dimensions. IDRRL is robust to demonstration quality and can learn efficiently using few demonstrations. We show that adding IDRRL to the Q-Learning algorithm leads to faster learning on a set of mountain car tasks and the robot swimmers problem.
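
A rough sketch of the dimensionality-reduction step, assuming PCA is used to find the hyperplane from demonstrated states and a crude discretization makes the reduced state usable by a tabular learner. The helper names and the discretization are hypothetical; the iterative transfer in IDRRL is only hinted at in the closing comment.

```python
# Sketch: use demonstrations to find a low-dimensional hyperplane (via PCA)
# and learn on the projected state. Not the authors' implementation.
import numpy as np

def fit_projection(demo_states, n_components):
    """PCA on demonstrated states: returns the mean and top principal directions."""
    X = np.asarray(demo_states, dtype=float)
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]             # components as rows

def project(state, mean, components):
    return components @ (np.asarray(state, dtype=float) - mean)

def discretize(z, bins_per_dim=10, low=-1.0, high=1.0):
    """Crude binning of the reduced state so tabular Q-learning applies."""
    scaled = np.clip((z - low) / (high - low), 0.0, 1.0)
    return tuple((scaled * (bins_per_dim - 1)).astype(int))

# Iterating k = 1, 2, ... and re-fitting with more components, while reusing the
# learned values as initialization, mirrors the IDRRL idea of moving to
# increasingly higher-dimensional hyperplanes.
```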


Combining Multiple Correlated Reward and Shaping Signals by Measuring Confidence

AAAI Conferences

Multi-objective problems with correlated objectives are a class of problems that deserve specific attention. In contrast to typical multi-objective problems, they do not require the identification of trade-offs between the objectives, as (near-) optimal solutions for any objective are (near-) optimal for every objective. Intelligently combining the feedback from these objectives, instead of only looking at a single one, can improve optimization. This class of problems is very relevant in reinforcement learning, as any single-objective reinforcement learning problem can be framed as such a multi-objective problem using multiple reward shaping functions. After discussing this problem class, we propose a solution technique for such reinforcement learning problems, called adaptive objective selection. This technique makes a temporal difference learner estimate the Q-function for each objective in parallel, and introduces a way of measuring confidence in these estimates. This confidence metric is then used to choose which objective's estimates to use for action selection. We show significant improvements in performance over other plausible techniques on two problem domains. Finally, we provide an intuitive analysis of the technique's decisions, yielding insights into the nature of the problems being solved.
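
A minimal sketch of adaptive objective selection under stated assumptions: one Q-table per (shaped) objective is updated in parallel, and the objective whose estimates look most reliable is used for action selection. Using the variance of recent TD errors as the confidence proxy is an assumption made for illustration; the paper defines its own confidence measure on the estimates.

```python
# Sketch of adaptive objective selection: parallel Q-tables, one per objective,
# with the most "confident" objective chosen for action selection.
from collections import defaultdict, deque

class AdaptiveObjectiveSelection:
    def __init__(self, n_objectives, actions, alpha=0.1, gamma=0.99, window=50):
        self.Q = [defaultdict(float) for _ in range(n_objectives)]
        self.td_errors = [deque(maxlen=window) for _ in range(n_objectives)]
        self.actions, self.alpha, self.gamma = actions, alpha, gamma

    def _confidence(self, i):
        errs = self.td_errors[i]
        if len(errs) < 2:
            return 0.0
        mean = sum(errs) / len(errs)
        var = sum((e - mean) ** 2 for e in errs) / (len(errs) - 1)
        return -var                 # lower TD-error variance = higher confidence (proxy)

    def act(self, state):
        i = max(range(len(self.Q)), key=self._confidence)   # most confident objective
        return max(self.actions, key=lambda a: self.Q[i][(state, a)])

    def update(self, state, action, rewards, next_state, done):
        """`rewards[i]` is the reward under objective i (base reward + shaping i)."""
        for i, r in enumerate(rewards):
            best_next = 0.0 if done else max(
                self.Q[i][(next_state, a)] for a in self.actions)
            td = r + self.gamma * best_next - self.Q[i][(state, action)]
            self.Q[i][(state, action)] += self.alpha * td
            self.td_errors[i].append(td)
```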


Reinforcement Learning on Multiple Correlated Signals

AAAI Conferences

[...] conflicts may exist between objectives, there is in general a need to identify (a set of) tradeoff solutions. The set of optimal, i.e. non-dominated, incomparable solutions is called the Pareto-front. We identify multi-objective problems with correlated objectives (CMOP) as a specific subclass of multi-objective problems, defined to contain those MOPs whose Pareto-front is so limited that one can barely speak of tradeoffs (Brys et al. 2014b). By consequence, the system designer does not care about which of the very [...] as potential-based reward shaping functions (heuristic signals guiding exploration) (Brys et al. 2014a). We prove that this modification preserves the total order, and thus also optimality, of policies, mainly relying on the results by Ng, Harada, and Russell (1999). This insight, that any MDP can be framed as a CMOMDP, significantly increases the importance of this problem class, as well as techniques developed for it, as these could be used to solve regular single-objective MDPs faster and better, provided several meaningful shapings can be devised.
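
The shaping construction referred to here follows the standard potential-based form of Ng, Harada, and Russell (1999), which is what preserves the ordering of policies. A minimal sketch, with placeholder potential functions, of how one reward signal becomes several correlated signals:

```python
# Sketch: turning one reward signal into several correlated signals with
# potential-based reward shaping, r_i(s, a, s') = r + gamma * phi_i(s') - phi_i(s)
# (the form of Ng, Harada, and Russell 1999, which preserves optimal policies).
# The concrete potential functions below are placeholders.

def shaped_rewards(r, s, s_next, potentials, gamma=0.99):
    """One shaped reward per potential function; all share the same optima."""
    return [r + gamma * phi(s_next) - phi(s) for phi in potentials]

# Illustrative potentials for a goal-reaching task:
potentials = [
    lambda s: -abs(s[0] - 1.0),   # closeness of the first state feature to a goal value
    lambda s: s[1],               # e.g. a progress / velocity heuristic
]
# signals = shaped_rewards(r, s, s_next, potentials)  # feed each to its own learner
```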


Improving Convergence of CMA-ES Through Structure-Driven Discrete Recombination

AAAI Conferences

Evolutionary Strategies (ES) are a class of continuous optimization algorithms that have proven to perform very well on hard optimization problems. Whereas in earlier literature both intermediate and discrete recombination operators were used, we now see that most ES, e.g. CMA-ES, use only intermediate recombination. While CMA-ES is considered state-of-the-art in continuous optimization, we believe that reintroducing discrete recombination can improve the algorithm's ability to escape local optima. Specifically, we look at using information on the problem's structure to create building blocks for recombination.
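
A minimal sketch of what structure-driven discrete recombination could look like: each building block (a group of related variables given by the problem's structure) is copied intact from one randomly chosen parent. The function and data layout are hypothetical, and how such an operator is interleaved with the CMA-ES update is not reproduced here.

```python
# Sketch of structure-driven discrete recombination: each "building block"
# (a group of related variables) is copied intact from one randomly chosen parent.
import random

def discrete_block_recombination(parents, blocks):
    """
    parents: list of candidate solutions (equal-length lists of floats)
    blocks:  list of index groups, e.g. [[0, 1], [2], [3, 4, 5]]
    """
    child = [0.0] * len(parents[0])
    for block in blocks:
        donor = random.choice(parents)        # one parent donates the whole block
        for i in block:
            child[i] = donor[i]
    return child

# Example: recombine three parents whose variables form two building blocks.
# parents = [[0.1, 0.2, 0.9], [0.5, 0.4, 0.3], [0.7, 0.8, 0.1]]
# child = discrete_block_recombination(parents, blocks=[[0, 1], [2]])
```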