AITopics | Whiteson, Shimon

Collaborating Authors

Whiteson, Shimon

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Learning to Communicate with Deep Multi-Agent Reinforcement Learning

Foerster, Jakob, Assael, Ioannis Alexandros, Freitas, Nando de, Whiteson, Shimon

Neural Information Processing SystemsDec-31-2016

We consider the problem of multiple agents sensing and acting in environments with the goal of maximising their shared utility. In these environments, agents must learn communication protocols in order to share information that is needed to solve the tasks. By embracing deep neural networks, we are able to demonstrate end-to-end learning of protocols in complex environments inspired by communication riddles and multi-agent computer vision problems with partial observability. We propose two approaches for learning in these domains: Reinforced Inter-Agent Learning (RIAL) and Differentiable Inter-Agent Learning (DIAL). The former uses deep Q-learning, while the latter exploits the fact that, during learning, agents can backpropagate error derivatives through (noisy) communication channels. Hence, this approach uses centralised learning but decentralised execution. Our experiments introduce new environments for studying the learning of communication protocols and present a set of engineering innovations that are essential for success in these domains.

agent, deep learning, neural network, (15 more...)

Neural Information Processing Systems

Country: Europe > United Kingdom > England > Oxfordshire > Oxford (0.28)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Probably Approximately Correct Greedy Maximization

Satsangi, Yash, Whiteson, Shimon, Oliehoek, Frans A.

arXiv.org Machine LearningFeb-25-2016

Submodular function maximization finds application in a variety of real-world decision-making problems. However, most existing methods, based on greedy maximization, assume it is computationally feasible to evaluate F, the function being maximized. Unfortunately, in many realistic settings F is too expensive to evaluate exactly even once. We present probably approximately correct greedy maximization, which requires access only to cheap anytime confidence bounds on F and uses them to prune elements. We show that, with high probability, our method returns an approximately optimal set. We propose novel, cheap confidence bounds for conditional entropy, which appears in many common choices of F and for which it is difficult to find unbiased or bounded estimates. Finally, results on a real-world dataset from a multi-camera tracking system in a shopping mall demonstrate that our approach performs comparably to existing methods, but at a fraction of the computational cost.

artificial intelligence, machine learning, maximization, (15 more...)

arXiv.org Machine Learning

1602.0786

Country: Europe (0.14)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Copeland Dueling Bandits

Zoghi, Masrour, Karnin, Zohar S., Whiteson, Shimon, Rijke, Maarten de

Neural Information Processing SystemsDec-31-2015

A version of the dueling bandit problem is addressed in which a Condorcet winner may not exist. Two algorithms are proposed that instead seek to minimize regret with respect to the Copeland winner, which, unlike the Condorcet winner, is guaranteed to exist. The first, Copeland Confidence Bound (CCB), is designed for small numbers of arms, while the second, Scalable Copeland Bandits (SCB), works better for large-scale problems. We provide theoretical results bounding the regret accumulated by CCB and SCB, both substantially improving existing results. Such existing results either offer bounds of the form O(K log T) but require restrictive assumptions, or offer bounds of the form O(K^2 log T) without requiring such assumptions. Our results offer the best of both worlds: O(K log T) bounds without restrictive assumptions.

algorithm, artificial intelligence, big data, (19 more...)

Neural Information Processing Systems

Country:

Europe > Netherlands (0.15)
North America > United States (0.14)
Europe > United Kingdom > England (0.14)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.70)
Information Technology > Data Science > Data Mining > Big Data (0.54)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.47)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.46)

Add feedback

Exploiting Submodular Value Functions for Faster Dynamic Sensor Selection

Satsangi, Yash (University of Amsterdam) | Whiteson, Shimon (University of Amsterdam) | Oliehoek, Frans A. (University of Amsterdam)

AAAI ConferencesMar-6-2015

A key challenge in the design of multi-sensor systems is the efficient allocation of scarce resources such as bandwidth, CPU cycles, and energy, leading to the dynamic sensor selection problem in which a subset of the available sensors must be selected at each timestep. While partially observable Markov decision processes (POMDPs) provide a natural decision-theoretic model for this problem, the computational cost of POMDP planning grows exponentially in the number of sensors, making it feasible only for small problems. We propose a new POMDP planning method that uses greedy maximization to greatly improve scalability in the number of sensors. We show that, under certain conditions, the value function of a dynamic sensor selection POMDP is submodular and use this result to bound the error introduced by performing greedy maximization. Experimental results on a real-world dataset from a multi-camera tracking system in a shopping mall show it achieves similar performance to existing methods but incurs only a fraction of the computational cost, leading to much better scalability in the number of cameras.

artificial intelligence, machine learning, submodular, (15 more...)

AAAI Conferences

Twenty-Ninth AAAI Conference on Artificial Intelligence

Country: Europe > Netherlands (0.14)

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Add feedback

Towards Personalised Gaming via Facial Expression Recognition

Blom, Paris Mavromoustakos (University of Amsterdam) | Bakkes, Sander (University of Amsterdam) | Tan, Chek Tien (University of Technology Sydney) | Whiteson, Shimon (University of Amsterdam) | Roijers, Diederik (University of Amsterdam) | Valenti, Roberto (University of Amsterdam) | Gevers, Theo (University of Amsterdam)

AAAI ConferencesSep-29-2014

In this paper we propose an approach for personalising the space in which a game is played (i.e., levels) dependent on classifications of the user's facial expression — to the end of tailoring the affective game experience to the individual user. Our approach is aimed at online game personalisation, i.e., the game experience is personalised during actual play of the game. A key insight of this paper is that game personalisation techniques can leverage novel computer vision-based techniques to unobtrusively infer player experiences automatically based on facial expression analysis. Specifically, to the end of tailoring the affective game experience to the individual user, in this paper we (1) leverage the proven InSight facial expression recognition SDK as a model of the user's affective state InSight, and (2) employ this model for guiding the online game personalisation process. User studies that validate the game personalisation approach in the actual video game Infinite Mario Bros. reveal that it provides an effective basis for converging to an appropriate affective state for the individual human player.

facial expression recognition, personalised gaming

AAAI Conferences

Tenth Artificial Intelligence and Interactive Digital Entertainment Conference

Industry: Leisure & Entertainment > Games > Computer Games (0.73)

Technology: Information Technology > Artificial Intelligence > Vision > Face Recognition (1.00)

Add feedback

Bounded Approximations for Linear Multi-Objective Planning Under Uncertainty

Roijers, Diederik Marijn (University of Amsterdam) | Scharpff, Joris (Delft University of Technology) | Spaan, Matthijs (Delft University of Technology) | Oliehoek, Frans (University of Amsterdam) | Weerdt, Mathijs de (Delft University of Technology) | Whiteson, Shimon (University of Amsterdam)

AAAI ConferencesJun-9-2014

Planning under uncertainty poses a complex problem in which multiple objectives often need to be balanced. When dealing with multiple objectives, it is often assumed that the relative importance of the objectives is known a priori. However, in practice human decision makers often find it hard to specify such preferences, and would prefer a decision support system that presents a range of possible alternatives. We propose two algorithms for computing these alternatives for the case of linearly weighted objectives. First, we propose an anytime method, approximate optimistic linear support (AOLS), that incrementally builds up a complete set of ε-optimal plans, exploiting the piecewise linear and convex shape of the value function. Second, we propose an approximate anytime method, scalarised sample incremental improvement (SSII), that employs weight sampling to focus on the most interesting regions in weight space, as suggested by a prior over preferences. We show empirically that our methods are able to produce (near-)optimal alternative sets orders of magnitude faster than existing techniques.

bounded approximation, linear multi-objective planning

AAAI Conferences

Twenty-Fourth International Conference on Automated Planning and Scheduling

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)

Add feedback

Exploiting Agent and Type Independence in Collaborative Graphical Bayesian Games

Oliehoek, Frans A., Whiteson, Shimon, Spaan, Matthijs T. J.

arXiv.org Artificial IntelligenceApr-25-2014

Efficient collaborative decision making is an important challenge for multiagent systems. Finding optimal joint actions is especially challenging when each agent has only imperfect information about the state of its environment. Such problems can be modeled as collaborative Bayesian games in which each agent receives private information in the form of its type. However, representing and solving such games requires space and computation time exponential in the number of agents. This article introduces collaborative graphical Bayesian games (CGBGs), which facilitate more efficient collaborative decision making by decomposing the global payoff function as the sum of local payoff functions that depend on only a few agents. We propose a framework for the efficient solution of CGBGs based on the insight that they posses two different types of independence, which we call agent independence and type independence. In particular, we present a factor graph representation that captures both forms of independence and thus enables efficient solutions. In addition, we show how this representation can provide leverage in sequential tasks by using it to construct a novel method for decentralized partially observable Markov decision processes. Experimental results in both random and benchmark tasks demonstrate the improved scalability of our methods compared to several existing alternatives.

agent, game theory, survey article, (15 more...)

arXiv.org Artificial Intelligence

1108.0404

Country:

Europe (1.00)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)

Industry: Leisure & Entertainment > Sports (0.45)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

Report on the 2008 Reinforcement Learning Competition

Whiteson, Shimon (University of Amsterdam) | Tanner, Brian (University of Alberta) | White, Adam (University of Alberta)

AI MagazineJun-28-2010

Report on the 2008 Reinforcement Learning Competition

artificial intelligence, competition, reinforcement learning, (5 more...)

AI Magazine

Genre: Contests & Prizes (0.82)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.95)

Add feedback

Report on the 2008 Reinforcement Learning Competition

Whiteson, Shimon (University of Amsterdam) | Tanner, Brian (University of Alberta) | White, Adam (University of Alberta)

AI MagazineJun-28-2010

This article reports on the 2008 Reinforcement Learning Competition, which began in November 2007 and ended with a workshop at the International Conference on Machine Learning (ICML) in July 2008 in Helsinki, Finland. Researchers from around the world developed reinforcement learning agents to compete in six problems of various complexity and difficulty. The competition employed fundamentally redesigned evaluation frameworks that, unlike those in previous competitions, aimed to systematically encourage the submission of robust learning methods. We describe the unique challenges of empirical evaluation in reinforcement learning and briefly review the history of the previous competitions and the evaluation frameworks they employed. We also describe the novel frameworks developed for the 2008 competition as well as the software infrastructure on which they rely. Furthermore, we describe the six competition domains and present a summary of selected competition results. Finally, we discuss the implications of these results and outline ideas for the future of the competition.

artificial intelligence, competition, soccer, (17 more...)

AI Magazine

Country:

North America > United States (1.00)
North America > Canada > Alberta (0.29)
Europe > Finland > Uusimaa > Helsinki (0.24)

Genre:

Research Report (1.00)
Overview (1.00)
Instructional Material > Course Syllabus & Notes (0.68)

Industry:

Leisure & Entertainment > Sports > Soccer (0.94)
Leisure & Entertainment > Games (0.70)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback