Edmonton
Domain and Function: A Dual-Space Model of Semantic Relations and Compositions
Given appropriate representations of the semantic relations between carpenter and wood and between mason and stone (for example, vectors in a vector space model), a suitable algorithm should be able to recognize that these relations are highly similar (carpenter is to wood as mason is to stone; the relations are analogous). Likewise, with representations of dog, house, and kennel, an algorithm should be able to recognize that the semantic composition of dog and house, dog house, is highly similar to kennel (dog house and kennel are synonymous). It seems that these two tasks, recognizing relations and compositions, are closely connected. However, up to now, the best models for relations are significantly different from the best models for compositions. In this paper, we introduce a dual-space model that unifies these two tasks. This model matches the performance of the best previous models for relations and compositions. The dual-space model consists of a space for measuring domain similarity and a space for measuring function similarity. Carpenter and wood share the same domain, the domain of carpentry. Mason and stone share the same domain, the domain of masonry. Carpenter and mason share the same function, the function of artisans. Wood and stone share the same function, the function of materials. In the composition dog house, kennel has some domain overlap with both dog and house (the domains of pets and buildings). The function of kennel is similar to the function of house (the function of shelters). By combining domain and function similarities in various ways, we can model relations, compositions, and other aspects of semantics.
Using Sliding Windows to Generate Action Abstractions in Extensive-Form Games
Hawkin, John Alexander (University of Alberta) | Holte, Robert (University of Alberta) | Szafron, Duane (University of Alberta)
In extensive-form games with a large number of actions, careful abstraction of the action space is critically important to performance. In this paper we extend previous work on action abstraction using no-limit poker games as our test domains. We show that in such games it is no longer necessary to choose, a priori, one specific range of possible bet sizes. We introduce an algorithm that adjusts the range of bet sizes considered for each bet individually in an iterative fashion. This flexibility results in a substantially improved game value in no-limit Leduc poker. When applied to no-limit Texas Hold'em our algorithm produces an action abstraction that is about one third the size of a state of the art hand-crafted action abstraction, yet has a better overall game value.
Generating Coherent Summaries with Textual Aspects
Zhang, Renxian (The Hong Kong Polytechnic University) | Li, Wenjie (The Hong Kong Polytechnic University) | Gao, Dehong (The Hong Kong Polytechnic University)
Initiated by TAC 2010, aspect-guided summaries not only address specific user need, but also ameliorate content-level coherence by using aspect information. This paper presents a full-fledged system composed of three modules: finding sentence-level textual aspects, modeling aspect-based coherence with an HMM model, and selecting and ordering sentences with aspect information to generate coherent summaries. The evaluation results on the TAC 2011 datasets show the superiority of aspect-guided summaries in terms of both information coverage and textual coherence.
Finding Optimal Abstract Strategies in Extensive-Form Games
Johanson, Michael (University of Alberta) | Bard, Nolan (University of Alberta) | Burch, Neil (University of Alberta) | Bowling, Michael (University of Alberta)
Extensive-form games are a powerful model for representing interactions between agents. Nash equilibrium strategies are a common solution concept for extensive-form games and, in two-player zero-sum games, there are efficient algorithms for calculating such strategies. In large games, this computation may require too much memory and time to be tractable. A standard approach in such cases is to apply a lossy state-space abstraction technique to produce a smaller abstract game that can be tractably solved, while hoping that the resulting abstract game equilibrium is close to an equilibrium strategy in the unabstracted game. Recent work has shown that this assumption is unreliable, and an arbitrary Nash equilibrium in the abstract game is unlikely to be even near the least suboptimal strategy that can be represented in that space. In this work, we present for the first time an algorithm which efficiently finds optimal abstract strategies --- strategies with minimal exploitability in the unabstracted game. We use this technique to find the least exploitable strategy ever reported for two-player limit Texas hold'em.
Generalized Sampling and Variance in Counterfactual Regret Minimization
Gibson, Richard (University of Alberta) | Lanctot, Marc (University of Alberta) | Burch, Neil (University of Alberta) | Szafron, Duane (University of Alberta) | Bowling, Michael (University of Alberta)
In large extensive form games with imperfect information, Counterfactual Regret Minimization (CFR) is a popular, iterative algorithm for computing approximate Nash equilibria. While the base algorithm performs a full tree traversal on each iteration, Monte Carlo CFR (MCCFR) reduces the per iteration time cost by traversing just a sampled portion of the tree. On the other hand, MCCFR's sampled values introduce variance, and the effects of this variance were previously unknown. In this paper, we generalize MCCFR by considering any generic estimator of the sought values. We show that any choice of an estimator can be used to probabilistically minimize regret, provided the estimator is bounded and unbiased. In addition, we relate the variance of the estimator to the convergence rate of an algorithm that calculates regret directly from the estimator. We demonstrate the application of our analysis by defining a new bounded, unbiased estimator with empirically lower variance than MCCFR estimates. Finally, we use this estimator in a new sampling algorithm to compute approximate equilibria in Goofspiel, Bluff, and Texas hold'em poker. Under each of our selected sampling schemes, our new algorithm converges faster than MCCFR.
Approximate Policy Iteration with Linear Action Models
Yao, Hengshuai (University of Alberta) | Szepesvari, Csaba (University of Alberta)
In this paper we consider the problem of finding a good policy given some batch data.We propose a new approach, LAM-API, that first builds a so-called linear action model (LAM) from the data and then uses the learned model and the collected data in approximate policy iteration (API) to find a good policy.A natural choice for the policy evaluation step in this algorithm is to use least-squares temporal difference (LSTD) learning algorithm.Empirical results on three benchmark problems show that this particular instance of LAM-API performs competitively as compared with LSPI, both from the point of view of data and computational efficiency.
Investigating Contingency Awareness Using Atari 2600 Games
Bellemare, Marc G. (University of Alberta) | Veness, Joel (University of Alberta) | Bowling, Michael (University of Alberta)
Contingency awareness is the recognition that some aspects of a future observation are under an agent's control while others are solely determined by the environment. This paper explores the idea of contingency awareness in reinforcement learning using the platform of Atari 2600 games. We introduce a technique for accurately identifying contingent regions and describe how to exploit this knowledge to generate improved features for value function approximation. We evaluate the performance of our techniques empirically, using 46 unseen, diverse, and challenging games for the Atari 2600 console. Our results suggest that contingency awareness is a generally useful concept for model-free reinforcement learning agents.
A Well-Founded Semantics for Basic Logic Programs with Arbitrary Abstract Constraint Atoms
Wang, Yisong (Guizhou University) | Lin, Fangzhen (Hong Kong University of Science and Technology) | Zhang, Mingyi (Guizhou Academy of Sciences) | You, Jia-Huai (University of Alberta)
Logic programs with abstract constraint atoms proposed by Marek and Truszczynski are very general logic programs.They are general enough to captureaggregate logic programs as well asrecently proposed description logic programs.In this paper, we propose a well-founded semantics for basic logic programs with arbitrary abstract constraint atoms, which are sets of rules whose heads have exactly one atom. Weshow that similar to the well-founded semanticsof normal logic programs, it has many desirable properties such as that it can becomputed in polynomial time, and is always correct with respect to theanswer set semantics. This paves the way for using our well-founded semanticsto simplify these logic programs. We also show how our semantics can be applied toaggregate logic programs and description logic programs, and compare itto the well-founded semantics already proposed for these logic programs.
Fast and Accurate Predictions of IDA*'s Performance
Lelis, Levi H. S. (University of Alberta) | Zilles, Sandra (University of Regina) | Holte, Robert C. (University of Alberta)
Korf, Reid and Edelkamp initiated a line of research for developing methods (KRE and later CDP) that predict the number of nodes expanded by IDA* for a given start state and cost bound. Independent of that, Chen developed a method (SS) that can also be used to predict the number of nodes expanded by IDA*. In this paper we advance both of these prediction methods. First, we develop a variant of CDP that can be orders of magnitude faster than CDP while producing exactly the same predictions. Second, we show how ideas developed in the KRE line of research can be used to substantially improve the predictions produced by SS. Third, we make an empirical comparison between our new enhanced versions of CDP and SS. Our experimental results point out that CDP is suitable for applications that require less accurate but very fast predictions, while SS is suitable for applications that require more accurate predictions but allow more computation time.
Speeding Up Planning in Markov Decision Processes via Automatically Constructed Abstractions
Isaza, Alejandro, Szepesvari, Csaba, Bulitko, Vadim, Greiner, Russell
In this paper, we consider planning in stochastic shortest path (SSP) problems, a subclass of Markov Decision Problems (MDP). We focus on medium-size problems whose state space can be fully enumerated. This problem has numerous important applications, such as navigation and planning under uncertainty. We propose a new approach for constructing a multi-level hierarchy of progressively simpler abstractions of the original problem. Once computed, the hierarchy can be used to speed up planning by first finding a policy for the most abstract level and then recursively refining it into a solution to the original problem. This approach is fully automated and delivers a speed-up of two orders of magnitude over a state-of-the-art MDP solver on sample problems while returning near-optimal solutions. We also prove theoretical bounds on the loss of solution optimality resulting from the use of abstractions.