University of Alberta
Delay-Tolerant Online Convex Optimization: Unified Analysis and Adaptive-Gradient Algorithms
Joulani, Pooria (University of Alberta) | Gyorgy, Andras (Imperial College London) | Szepesvari, Csaba (University of Alberta)
We present a unified, black-box-style method for developing and analyzing online convex optimization (OCO) algorithms for full-information online learning in delayed-feedback environments. Our new, simplified analysis enables us to substantially improve upon previous work and to solve a number of open problems from the literature. Specifically, we develop and analyze asynchronous AdaGrad-style algorithms from the Follow-the-Regularized-Leader (FTRL) and Mirror-Descent family that, unlike previous works, can handle projections and adapt both to the gradients and the delays, without relying on either strong convexity or smoothness of the objective function, or data sparsity. Our unified framework builds on a natural reduction from delayed-feedback to standard (non-delayed) online learning. This reduction, together with recent unification results for OCO algorithms, allows us to analyze the regret of generic FTRL and Mirror-Descent algorithms in the delayed-feedback setting in a unified manner using standard proof techniques. In addition, the reduction is exact and can be used to obtain both upper and lower bounds on the regret in the delayed-feedback setting.
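The heart of the framework is a reduction that runs a standard (non-delayed) online learner as a black box and simply feeds it whatever feedback has arrived by each round. Below is a minimal sketch of that idea; the `predict`/`update` interface and the projected online-gradient-descent base learner are illustrative assumptions, not the paper's algorithms.

```python
import numpy as np

class OnlineGradientDescent:
    """Simple non-delayed base learner: projected online gradient descent."""

    def __init__(self, dim, lr=0.1, radius=1.0):
        self.x = np.zeros(dim)
        self.lr = lr
        self.radius = radius  # radius of the Euclidean ball used for projection

    def predict(self):
        return self.x

    def update(self, grad):
        self.x = self.x - self.lr * grad
        norm = np.linalg.norm(self.x)
        if norm > self.radius:            # project back onto the ball
            self.x *= self.radius / norm


class DelayedFeedbackReduction:
    """Black-box reduction: update the base learner with whatever delayed
    gradients have just arrived, then play its current prediction."""

    def __init__(self, base):
        self.base = base

    def play(self, arrived_gradients):
        for g in arrived_gradients:       # apply all feedback available now
            self.base.update(g)
        return self.base.predict()        # then predict for the current round


# Usage: the gradients for rounds 0 and 1 both arrive only at round 2.
learner = DelayedFeedbackReduction(OnlineGradientDescent(dim=2))
print(learner.play([]))                                              # round 0
print(learner.play([]))                                              # round 1
print(learner.play([np.array([1.0, 0.0]), np.array([0.0, 1.0])]))    # round 2
```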
On the Completeness of Best-First Search Variants That Use Random Exploration
Valenzano, Richard Anthony (University of Toronto) | Xie, Fan (University of Alberta)
While suboptimal best-first search algorithms like Greedy Best-First Search are frequently used when building automated planning systems, their greedy nature can make them susceptible to being easily misled by flawed heuristics. This weakness has motivated the development of best-first search variants like epsilon-greedy node selection, type-based exploration, and diverse best-first search, which all use random exploration to mitigate the impact of heuristic error. In this paper, we provide a theoretical justification for this increased robustness by formally analyzing how these algorithms behave on infinite graphs. In particular, we show that when using these approaches on any infinite graph, the probability of not finding a solution can be made arbitrarily small given enough time. This result is shown to hold for a class of algorithms that includes the three mentioned above, regardless of how misleading the heuristic is.
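As a concrete illustration of the kind of random exploration analyzed here, the sketch below grafts epsilon-greedy node selection onto a greedy best-first search. The graph interface (`successors`, `heuristic`, `is_goal`) and all parameter values are assumptions made for illustration, not the paper's notation or code.

```python
import heapq
import itertools
import random

def epsilon_greedy_gbfs(start, successors, heuristic, is_goal,
                        epsilon=0.1, max_expansions=1_000_000):
    counter = itertools.count()                    # tie-breaker for the heap
    open_list = [(heuristic(start), next(counter), start)]
    seen = {start}
    for _ in range(max_expansions):
        if not open_list:
            return None                            # search space exhausted
        if random.random() < epsilon:
            # Exploration: expand a uniformly random node from the open list.
            i = random.randrange(len(open_list))
            _, _, node = open_list.pop(i)
            heapq.heapify(open_list)               # restore the heap property
        else:
            # Greedy step: expand the open node with the lowest heuristic value.
            _, _, node = heapq.heappop(open_list)
        if is_goal(node):
            return node
        for child in successors(node):
            if child not in seen:
                seen.add(child)
                heapq.heappush(open_list, (heuristic(child), next(counter), child))
    return None

# Toy usage on a small graph where the heuristic prefers a dead end from "S".
graph = {"S": ["trap", "A"], "A": ["G"], "trap": [], "G": []}
h = {"S": 3, "trap": 0, "A": 2, "G": 0}
print(epsilon_greedy_gbfs("S", lambda n: graph[n], lambda n: h[n],
                          lambda n: n == "G", epsilon=0.3))
```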
Compressed Conditional Mean Embeddings for Model-Based Reinforcement Learning
Lever, Guy (University College London) | Shawe-Taylor, John (University College London) | Stafford, Ronnie (University College London) | Szepesvari, Csaba (University of Alberta)
We present a model-based approach to solving Markov decision processes (MDPs) in which the system dynamics are learned using conditional mean embeddings (CMEs). This class of methods comes with strong performance guarantees, and enables planning to be performed in an induced finite (pseudo-)MDP, which approximates the MDP but can be solved exactly using dynamic programming. Existing methods have two drawbacks: first, the size of the induced finite (pseudo-)MDP scales quadratically with the amount of data used to learn the model, which makes planning with the learned model costly in both memory and time; second, learning the CME itself with powerful kernel least-squares methods is expensive, creating a second computational bottleneck. We present an algorithm that maintains a rich kernelized CME model class but solves both problems: first, we show that the loss function for the CME model suggests a principled approach to compressing the induced (pseudo-)MDP, leading to faster planning while maintaining the guarantees; second, we propose to learn the CME model itself using fast sparse-greedy kernel regression well suited to the RL context. We demonstrate superior performance to existing methods in this class of model-based approaches on a range of MDPs.
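For context, planning in the induced finite (pseudo-)MDP amounts to exact dynamic programming over the support states of the learned model. The sketch below shows value iteration in such a finite model under an assumed representation (one weight matrix per action over n support states); it is not the authors' implementation and omits the CME learning and compression steps.

```python
import numpy as np

def pseudo_mdp_value_iteration(W, R, gamma=0.95, iters=500):
    """Exact value iteration in a finite (pseudo-)MDP over n support states.

    W[a] is an (n, n) matrix whose row i gives the weights (not necessarily a
    probability distribution) over next support states when taking action a in
    support state i; R[a] is the (n,) vector of immediate rewards."""
    n = next(iter(W.values())).shape[0]
    V = np.zeros(n)
    for _ in range(iters):
        V = np.max([R[a] + gamma * W[a] @ V for a in W], axis=0)
    return V

# Tiny illustrative model: 3 support states, 2 actions, made-up numbers.
rng = np.random.default_rng(1)
W = {a: rng.dirichlet(np.ones(3), size=3) for a in ("left", "right")}
R = {a: rng.uniform(size=3) for a in ("left", "right")}
print(pseudo_mdp_value_iteration(W, R))
```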
Using Correlated Strategies for Computing Stackelberg Equilibria in Extensive-Form Games
Cermak, Jiri (Czech Technical University in Prague) | Bosansky, Branislav (Czech Technical University in Prague) | Durkota, Karel (Czech Technical University in Prague) | Lisy, Viliam (University of Alberta) | Kiekintveld, Christopher (University of Texas at El Paso)
Strong Stackelberg Equilibrium (SSE) is a fundamental solution concept in game theory in which one player commits to a strategy, while the other player observes this commitment and plays a best response. We present a new algorithm for computing SSE for two-player extensive-form general-sum games with imperfect information (EFGs), where computing SSE is an NP-hard problem. Our algorithm is based on a correlated version of SSE, known as Stackelberg Extensive-Form Correlated Equilibrium (SEFCE). Our contribution is therefore twofold: (1) we give the first linear program for computing SEFCE in EFGs without chance, and (2) we repeatedly solve and modify this linear program in a systematic search until we arrive at an SSE. Our new algorithm outperforms the best previous algorithms by several orders of magnitude.
Factorization Ranking Model for Move Prediction in the Game of Go
Xiao, Chenjun (University of Alberta) | Müller, Martin (University of Alberta)
In this paper, we investigate the move prediction problem in the game of Go by proposing a new ranking model named the Factorization Bradley-Terry (FBT) model. This model treats move prediction as a competition among groups of features while also taking interactions between features into account. An FBT model provides a probability distribution that expresses a preference over moves; therefore, it can easily be compiled into an evaluation function and applied in a modern Go program. We propose a Stochastic Gradient Descent (SGD) algorithm to train an FBT model from expert game records, and provide two methods for fast computation of the gradient in order to speed up training. Experimental results show that our FBT model outperforms the state-of-the-art move prediction system, Latent Factor Ranking (LFR).
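To make the idea concrete, the sketch below shows how a factorized Bradley-Terry-style model can turn feature groups into a move distribution: each legal move activates a group of features, a group's log-strength combines individual feature weights with low-rank pairwise interaction terms, and a softmax over strengths gives the preference distribution. The notation, rank, and numbers are illustrative assumptions rather than the paper's; in the paper the parameters are trained by SGD to maximize the likelihood of expert moves.

```python
import numpy as np

def group_strength(features, w, V):
    """Log-strength of a feature group: individual weights plus factorized
    pairwise interactions given by inner products of factor vectors V[i]."""
    s = sum(w[i] for i in features)
    for a in range(len(features)):
        for b in range(a + 1, len(features)):
            s += V[features[a]] @ V[features[b]]
    return s

def move_distribution(candidate_groups, w, V):
    """Softmax over group strengths: a probability for each legal move."""
    scores = np.array([group_strength(g, w, V) for g in candidate_groups])
    scores -= scores.max()                    # numerical stability
    p = np.exp(scores)
    return p / p.sum()

# Example: 3 legal moves, 5 binary features, rank-2 factor vectors.
rng = np.random.default_rng(0)
w = rng.normal(size=5)        # per-feature weights
V = rng.normal(size=(5, 2))   # factor vectors for pairwise interactions
moves = [[0, 2], [1, 3, 4], [2, 4]]
print(move_distribution(moves, w, V))
```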
Bidirectional Search That Is Guaranteed to Meet in the Middle
Holte, Robert C. (University of Alberta) | Felner, Ariel (Ben-Gurion University) | Sharon, Guni (Ben-Gurion University) | Sturtevant, Nathan R. (University of Denver)
We present MM, the first bidirectional heuristic search algorithm whose forward and backward searches are guaranteed to "meet in the middle", i.e., never expand a node beyond the solution midpoint. We also present a novel framework for comparing MM, A*, and brute-force search, and identify conditions favoring each algorithm. Finally, we present experimental results that support our theoretical analysis.
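The key device behind the meet-in-the-middle guarantee is the priority used to order expansions in each direction, of the form max(f(n), 2g(n)): any node whose g-value exceeds half the optimal solution cost C* has priority above C*, so it need not be expanded before an optimal solution is found. The snippet below is only an illustration of this ordering, not the paper's implementation.

```python
def mm_priority(g, h):
    """Priority of a node in one direction of MM: max(f(n), 2*g(n))."""
    return max(g + h, 2 * g)

# On an optimal path of cost C* = 10, the midpoint node (g = 5, admissible
# h <= 5) has priority at most 10, while any node with g = 6 already has
# priority 12 > C*, so it need not be expanded before the searches meet.
print(mm_priority(5, 5), mm_priority(6, 3))   # -> 10 12
```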
Playable Experiences at AIIDE 2015
Cook, Michael (Falmouth University) | Eiserloh, Squirrel (Southern Methodist University) | Robertson, Justus (North Carolina State University) | Young, R. Michael (North Carolina State University) | Thompson, Tommy (Table Flip Games / University of Derby) | Churchill, David (Lunarch Studios / University of Alberta) | Cerny, Martin (Charles University in Prague) | Hernandez, Sergio Poo (University of Alberta) | Bulitko, Vadim (University of Alberta)
Keeping the Player on an Emotional Trajectory in Interactive Storytelling
Hernandez, Sergio Poo (University of Alberta) | Bulitko, Vadim (University of Alberta) | Spetch, Marcia (University of Alberta)
Artificial Intelligence (AI) techniques have been widely used in video games to control non-playable characters. More recently, AI has been applied to automated story generation with the objective of managing the player's experience in an interactive narrative. Such AI experience managers can generate and adapt narrative dynamically, often in response to the player's in-game actions. We implement and evaluate a recently proposed AI experience manager, PACE, which predicts the player's emotional response to a narrative event and uses such predictions to shape the narrative to keep the player on an author-supplied target emotional curve.
Using Lanchester Attrition Laws for Combat Prediction in StarCraft
Stanescu, Marius Adrian (University of Alberta) | Barriga, Nicolas (University of Alberta) | Buro, Michael (University of Alberta)
Smart decision making at the tactical level is important for Artificial Intelligence (AI) agents to perform well in the domain of real-time strategy (RTS) games. Winning battles is crucial in RTS games, and while humans can decide when and how to attack based on their experience, it is challenging for AI agents to estimate combat outcomes accurately. A few existing models address this problem in the game of StarCraft, but they come with many restrictions, such as not modeling injured units, supporting only a small number of unit types, or predicting only the winner of a fight and not the remaining army. Prediction using simulations is a popular alternative, but it is generally slow and requires extensive coding to model the game engine accurately. This paper introduces a model based on Lanchester's attrition laws that addresses the aforementioned limitations while being faster than running simulations. Unit strength values are learned via maximum likelihood estimation from past recorded battles. We present experiments that use a StarCraft simulator to generate battles for both training and testing, and show that the model makes accurate predictions. Furthermore, we implemented our method in a StarCraft bot that uses either this model or traditional simulations to decide when to attack or retreat. We present tournament results (against top bots from the 2014 AIIDE competition) comparing the performance of the two versions, and show increased winning percentages for our method.
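As a worked example of the underlying idea (a simplification of the generalized laws used in the paper), Lanchester's square law models mutual attrition as dA/dt = -beta*B and dB/dt = -alpha*A, which keeps alpha*A^2 - beta*B^2 constant and yields a closed-form prediction of the winner and its remaining army. The per-unit strength values below are made-up numbers; in the paper such strengths are learned from recorded battles by maximum likelihood estimation.

```python
import math

def square_law_outcome(a0, b0, alpha, beta):
    """dA/dt = -beta*B, dB/dt = -alpha*A  =>  alpha*A^2 - beta*B^2 is constant.

    Returns (winner, remaining_units)."""
    invariant = alpha * a0**2 - beta * b0**2
    if invariant > 0:       # army A wins the attrition battle
        return "A", math.sqrt(a0**2 - (beta / alpha) * b0**2)
    if invariant < 0:       # army B wins
        return "B", math.sqrt(b0**2 - (alpha / beta) * a0**2)
    return "draw", 0.0

# 20 units of strength 1.0 vs 14 units of strength 1.8 (illustrative only).
print(square_law_outcome(20, 14, 1.0, 1.8))
```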
Maximizing Flow as a Metacontrol in Angband
Mariusdottir, Thorey Maria (University of Alberta) | Bulitko, Vadim (University of Alberta) | Brown, Matthew (University of Alberta)
Flow is a psychological state that is reported to improve people's performance. Flow can emerge when a person's skills match the challenges of their activity. This paper applies the concept to artificial intelligence agents: we equip a decision-making agent with a metacontrol policy that guides it toward activities whose difficulty matches its skills. Consequently, we expect the agent's performance to improve. We implement and evaluate this approach in the role-playing game of Angband.
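A rough sketch of the metacontrol idea under an assumed formulation (scalar skill and difficulty estimates; not the paper's implementation): among the available activities, pick the one whose estimated difficulty is closest to the agent's current estimated skill, so that skill and challenge stay matched.

```python
def flow_metacontrol(skill, activity_difficulties):
    """Return the activity whose difficulty best matches the agent's skill."""
    return min(activity_difficulties,
               key=lambda a: abs(activity_difficulties[a] - skill))

# Example: an agent of skill 3.2 choosing a dungeon depth in a roguelike.
depths = {"depth_1": 1.0, "depth_5": 3.0, "depth_10": 6.5}
print(flow_metacontrol(3.2, depths))   # -> "depth_5"
```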