Goto

Collaborating Authors

 Edmonton


Puppet Search: Enhancing Scripted Behavior by Look-Ahead Search with Applications to Real-Time Strategy Games

AAAI Conferences

Real-Time Strategy (RTS) games have shown to be very resilient to standard  adversarial tree search techniques. Recently, a few approaches to tackle their complexity have emerged that use game state or move abstractions, or both. Unfortunately, the supporting experiments were either limited to simpler RTS environments ( u RTS, SparCraft) or lack testing against state-of-the-art game playing agents. Here, we propose Puppet Search , a new adversarial search framework based on scripts that can expose choice points to a look-ahead search procedure. Selecting a combination of a script and decisions for its choice points represents a move to be applied next. Such moves can be executed in the actual game, thus letting the script play, or in an abstract representation of the game state which can be used by an adversarial tree search algorithm. Puppet Search returns a principal variation of scripts and choices to be executed by the agent for a given time span. We implemented the algorithm in a complete StarCraft bot. Experiments show that it matches or outperforms all of the individual scripts that it uses when playing against state-of-the-art bots from the 2014 AIIDE StarCraft competition.


Utility-based Dueling Bandits as a Partial Monitoring Game

arXiv.org Artificial Intelligence

Partial monitoring is a generic framework for sequential decision-making with incomplete feedback. It encompasses a wide class of problems such as dueling bandits, learning with expect advice, dynamic pricing, dark pools, and label efficient prediction. We study the utility-based dueling bandit problem as an instance of partial monitoring problem and prove that it fits the time-regret partial monitoring hierarchy as an easy - i.e. Theta (sqrt{T})- instance. We survey some partial monitoring algorithms and see how they could be used to solve dueling bandits efficiently. Keywords: Online learning, Dueling Bandits, Partial Monitoring, Partial Feedback, Multiarmed Bandits


Fast Cross-Validation for Incremental Learning

AAAI Conferences

Cross-validation (CV) is one of the main tools for performance estimation and parameter tuning in machine learning. The general recipe for computing CV estimate is to run a learning algorithm separately for each CV fold, a computationally expensive process. In this paper, we propose a new approach to reduce the computational burden of CV-based performance estimation. As opposed to all previous attempts, which are specific to a particular learning model or problem domain, we propose a general method applicable to a large class of incremental learning algorithms, which are uniquely fitted to big data problems. In particular, our method applies to a wide range of supervised and unsupervised learning tasks with different performance criteria, as long as the base learning algorithm is incremental. We show that the running time of the algorithm scales logarithmically, rather than linearly, in the number of CV folds. Furthermore, the algorithm has favorable properties for parallel and distributed implementation. Experiments with state-of-the-art incremental learning algorithms confirm the practicality of the proposed method.


Kernel Contraction and Base Dependence: Redundancy in the Base Resulting in Different Types of Dependence

AAAI Conferences

The AGM paradigm of belief change studies the dynamics of belief states in light of new information. Finding, or even approximating, dependent or relevant beliefs to a change is valuable because, for example, it can narrow the set of beliefs considered during belief change operations. Gärdenfors' preservation criterion (GPC) suggests that formulas independent of a belief change should remain intact. GPC allows to build dependence relations that are theoretically linked with belief change. Such dependence relations can in turn be used as a theoretical benchmark against which to evaluate other approximate dependence or relevance relations. There are already some studies, based on GPC, on the parallelism between belief change and dependence. One study offers a dependence relation parallel to AGM contraction for belief sets. Another study links base dependence relation to a more general belief base contraction, saturated kernel contraction. Here we offer yet a more general parallelism between kernel contraction and base dependence. At this level of generalization, different types of base dependence emerge. We prove that this differentiation of base dependence types is a result of possible redundancy in the base. This provides a theoretical means to distinguish between redundant and informative parts of a belief base.


On Forgetting Postulates in Answer Set Programming

AAAI Conferences

Forgetting is an important mechanism for logic-based agent systems. A recent interest has been in the desirable properties of forgetting in answer set programming  (ASP)and their impact on the design of forgetting operators. It is known that some subsets of these propertiesare incompatible, i.e., they cannot be satisfied at the same time. In this paper, we are interested in the question onthe largest set Δ of pairs (Π, V), where Π is  a logic program and V is a set of atoms, such that a forgetting operator exists that satisfies all the desirable properties for each  (Π, V) in Δ.  We answer this question positively by discovering the precise condition under which the knowledge forgetting, a well-established approach to forgetting in ASP, satisfies the property of strong persistence, which leads to a sufficient and necessary condition for a forgetting operator to satisfy all the desirable properties proposed in the literature. We explore computational complexities on checking the condition and present a syntactic characterization which can serve as the basis of computing knowledge forgetting in ASP.


Adversarial Hierarchical-Task Network Planning for Complex Real-Time Games

AAAI Conferences

Real-time strategy (RTS) games are hard from an AI point of view because they have enormous state spaces, combinatorial branching factors, allow simultaneous and durative actions, and players have very little time to choose actions. For these reasons, standard game tree search methods such as alpha- beta search or Monte Carlo Tree Search (MCTS) are not sufficient by themselves to handle these games. This paper presents an alternative approach called Adversarial Hierarchical Task Network (AHTN) planning that combines ideas from game tree search with HTN planning. We present the basic algorithm, relate it to existing adversarial hierarchical planning methods, and present new extensions for simultaneous and durative actions to handle RTS games. We also present empirical results for the μRTS game, comparing it to other state of the art search algorithms for RTS games.


Correcting Covariate Shift with the Frank-Wolfe Algorithm

AAAI Conferences

Covariate shift is a fundamental problem for learning in non-stationary environments where the conditional distribution p(y|x) is the same between training and test data while their marginal distributions p tr (x) and p te (x) are different. Although many covariate shift correction techniques remain effective for real world problems, most do not scale well in practice. In this paper, using inspiration from recent optimization techniques, we apply the Frank-Wolfe algorithm to two well-known covariate shift correction techniques, Kernel Mean Matching (KMM) and Kullback-Leibler Importance Estimation Procedure (KLIEP), and identify an important connection between kernel herding and KMM. Our complexity analysis shows the benefits of the Frank-Wolfe approach over projected gradient methods in solving KMM and KLIEP. An empirical study then demonstrates the effectiveness and efficiency of the Frank-Wolfe algorithm for correcting covariate shift in practice.


A Fast Goal Recognition Technique Based on Interaction Estimates

AAAI Conferences

Goal Recognition is the task of inferring an actor's goals given some or all of the actor's observed actions. There is considerable interest in Goal Recognition for use in intelligent personal assistants, smart environments, intelligent tutoring systems, and monitoring user's needs. In much of this work, the actor's observed actions are compared against a generated library of plans. Recent work by Ramirez and Geffner makes use of AI planning to determine how closely a sequence of observed actions matches plans for each possible goal. For each goal, this is done by comparing the cost of a plan for that goal with the cost of a plan for that goal that includes the observed actions. This approach yields useful rankings, but is impractical for real-time goal recognition in large domains because of the computational expense of constructing plans for each possible goal. In this paper, we introduce an approach that propagates cost and interaction information in a plan graph, and uses this information to estimate goal probabilities. We show that this approach is much faster, but still yields high quality results.


Using Machine Translation to Provide Target-Language Edit Hints in Computer Aided Translation Based on Translation Memories

Journal of Artificial Intelligence Research

This paper explores the use of general-purpose machine translation (MT) in assisting the users of computer-aided translation (CAT) systems based on translation memory (TM) to identify the target words in the translation proposals that need to be changed (either replaced or removed) or kept unedited, a task we term as "word-keeping recommendation". MT is used as a black box to align source and target sub-segments on the fly in the translation units (TUs) suggested to the user. Source-language (SL) and target-language (TL) segments in the matching TUs are segmented into overlapping sub-segments of variable length and machine-translated into the TL and the SL, respectively. The bilingual sub-segments obtained and the matching between the SL segment in the TU and the segment to be translated are employed to build the features that are then used by a binary classifier to determine the target words to be changed and those to be kept unedited. In this approach, MT results are never presented to the translator. Two approaches are presented in this work: one using a word-keeping recommendation system which can be trained on the TM used with the CAT system, and a more basic approach which does not require any training. Experiments are conducted by simulating the translation of texts in several language pairs with corpora belonging to different domains and using three different MT systems. We compare the performance obtained to that of previous works that have used statistical word alignment for word-keeping recommendation, and show that the MT-based approaches presented in this paper are more accurate in most scenarios. In particular, our results confirm that the MT-based approaches are better than the alignment-based approach when using models trained on out-of-domain TMs. Additional experiments were performed to check how dependent the MT-based recommender is on the language pair and MT system used for training. These experiments confirm a high degree of reusability of the recommendation models across various MT systems, but a low level of reusability across language pairs.


The Spurious Path Problem in Abstraction

AAAI Conferences

Abstraction is a powerful technique in search and planning. A fundamental problem of abstraction is that it can create spurious paths, i.e., abstract paths that do not correspond to valid concrete paths. In this paper, we define spurious paths as a generalization of spurious states. We show that spurious paths can be categorized into two types: state-independent spurious paths and state-specific spurious paths. We present a practical method that eliminates state-independent spurious paths, as well as state-specific spurious paths when integrated with mutex detection methods. We provide syntactical conditions under which our method can remove state-independent spurious paths completely. We demonstrate that eliminating spurious paths can improve a heuristic substantially, even in abstract spaces that are free of spurious states.