Collaborating Authors

 Bloembergen, Daan


Robust temporal difference learning for critical domains

arXiv.org Machine Learning

We present a new Q-function operator for temporal difference (TD) learning methods that explicitly encodes robustness against significant rare events (SREs) in critical domains. The operator, which we call the $\kappa$-operator, allows a safe policy to be learned in a model-based fashion without actually observing the SRE. We introduce single- and multi-agent robust TD methods based on the $\kappa$-operator. Using the theory of Generalized Markov Decision Processes, we prove convergence of the operator to the optimal safe Q-function with respect to the model. In addition, we prove convergence to the optimal Q-function of the original MDP given that the probability of SREs vanishes. Empirical evaluations demonstrate the superior performance of $\kappa$-based TD methods both in the early learning phase and in the final converged stage. We further show robustness of the proposed method to small model errors, as well as its applicability in a multi-agent context.
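A minimal sketch of the general idea, assuming the robust target simply interpolates between the standard greedy backup and a model-based worst-case value with weight $\kappa$; the function name, the `sre_values` model, and the exact form of the blend are illustrative assumptions, not the paper's definition of the operator.

```python
import numpy as np

def kappa_td_update(Q, s, a, r, s_next, sre_values, kappa=0.2,
                    alpha=0.1, gamma=0.95):
    """Illustrative robust TD backup (not the paper's exact kappa-operator).

    Q           : 2-D numpy array of shape (num_states, num_actions)
    sre_values  : model-based estimates of successor values if a significant
                  rare event (SRE) were to occur after s_next
    The target interpolates between the usual greedy backup on the observed
    successor and a worst-case value taken from the SRE model, so the SRE
    never has to be observed in the data.
    """
    nominal = np.max(Q[s_next])          # standard Q-learning backup
    worst_case = np.min(sre_values)      # model-based value under the SRE
    target = r + gamma * ((1.0 - kappa) * nominal + kappa * worst_case)
    Q[s, a] += alpha * (target - Q[s, a])
    return Q
```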


Lenient Multi-Agent Deep Reinforcement Learning

arXiv.org Artificial Intelligence

Much of the success of single-agent deep reinforcement learning (DRL) in recent years can be attributed to the use of experience replay memories (ERM), which allow Deep Q-Networks (DQNs) to be trained efficiently through sampling stored state transitions. However, care is required when using ERMs for multi-agent deep reinforcement learning (MA-DRL), as stored transitions can become outdated because agents update their policies in parallel [11]. In this work we apply leniency [23] to MA-DRL. Lenient agents map state-action pairs to decaying temperature values that control the amount of leniency applied towards negative policy updates sampled from the ERM. This introduces optimism into the value-function update, and has been shown to facilitate cooperation in tabular fully-cooperative multi-agent reinforcement learning problems. We evaluate our Lenient-DQN (LDQN) empirically against the related Hysteretic-DQN (HDQN) algorithm [22], as well as a modified version we call scheduled-HDQN that uses average reward learning near terminal states. Evaluations take place in extended variations of the Coordinated Multi-Agent Object Transportation Problem (CMOTP) [8] which include fully-cooperative sub-tasks and stochastic rewards. We find that LDQN agents are more likely to converge to the optimal policy in a stochastic reward CMOTP compared to standard and scheduled-HDQN agents.
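To make the leniency mechanism concrete, here is a minimal tabular sketch; the paper applies the idea inside a DQN loss rather than a tabular update, and the leniency schedule, constants, and data structures below are assumptions chosen for the example.

```python
import math
import random

def lenient_td_update(Q, temperature, s, a, r, s_next, done,
                      alpha=0.1, gamma=0.95, leniency_k=2.0, temp_decay=0.995):
    """Illustrative tabular lenient update; Q and temperature are dicts of
    dicts pre-initialised for every state-action pair.

    Negative TD errors are ignored with a probability that grows with the
    temperature of the visited state-action pair; the temperature decays on
    every visit, so the agent becomes less forgiving over time.
    """
    target = r if done else r + gamma * max(Q[s_next].values())
    delta = target - Q[s][a]
    leniency = 1.0 - math.exp(-leniency_k * temperature[s][a])
    # Positive updates always go through; negative ones only when a random
    # draw beats the current leniency (this is the source of the optimism).
    if delta > 0 or random.random() > leniency:
        Q[s][a] += alpha * delta
    temperature[s][a] *= temp_decay
    return delta
```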


Theory of Cooperation in Complex Social Networks

AAAI Conferences

This paper presents a theoretical as well as empirical study of the evolution of cooperation on complex social networks, following the continuous action iterated prisoner's dilemma (CAIPD) model. In particular, convergence to network-wide agreement is proven both for evolutionary networks with fixed interaction dynamics and for coevolutionary networks where these dynamics change over time. Moreover, an extension to the CAIPD model is proposed that makes it possible to model influence on the evolution of cooperation in social networks. As such, this work contributes to a better understanding of behavioral change on social networks, and provides a first step towards their active control.
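As a toy illustration of the kind of network dynamics involved, the step below lets agents with continuous cooperation levels imitate better-performing neighbours; it is not the CAIPD update from the paper, and `payoff_fn` and the averaging rule are placeholders for the sketch.

```python
import numpy as np

def imitation_step(coop, adjacency, payoff_fn, rate=0.1):
    """Toy imitation dynamic on a network of continuous cooperation levels.

    coop      : 1-D array, cooperation level of each agent in [0, 1]
    adjacency : binary matrix, adjacency[i, j] == 1 if i and j interact
    payoff_fn : callable returning agent i's current payoff (placeholder)
    Each agent shifts its level toward the average of neighbours that earn a
    higher payoff; iterating the step drives the network toward a common
    level, i.e. the kind of network-wide agreement the convergence results concern.
    """
    n = len(coop)
    payoffs = np.array([payoff_fn(i, coop, adjacency) for i in range(n)])
    new_coop = coop.copy()
    for i in range(n):
        neighbours = np.nonzero(adjacency[i])[0]
        better = [j for j in neighbours if payoffs[j] > payoffs[i]]
        if better:
            new_coop[i] += rate * (np.mean(coop[better]) - coop[i])
    return new_coop
```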


Telepresence Robots as a Research Platform for AI

AAAI Conferences

Recently, various commercial telepresence robots have become available to the broader public. Here, we present the telepresence domain as a research platform for (re-)integrating AI. With MITRO, the Maastricht Intelligent Telepresence RObot, we built a low-cost working prototype of a robot system specifically designed for augmented and autonomous telepresence. Telepresence robots can be deployed in a wide range of application domains, and augmented presence with assisted control can greatly improve the experience for the user. The research domains that we focus on are human-robot interaction, navigation, and perception.