AITopics | Edmonton

Gap-Increasing Policy Evaluation for Efficient and Noise-Tolerant Reinforcement Learning

Kozuno, Tadashi, Han, Dongqi, Doya, Kenji

arXiv.org Machine Learning, Jun-18-2019

In real-world applications of reinforcement learning (RL), noise from inherent stochasticity of environments is inevitable. However, current policy evaluation algorithms, which play a key role in many RL algorithms, are either prone to noise or inefficient. To solve this issue, we introduce a novel policy evaluation algorithm, which we call Gap-increasing RetrAce Policy Evaluation (GRAPE). It leverages two recent ideas: (1) gap-increasing value update operators in advantage learning for noise-tolerance and (2) off-policy eligibility traces in the Retrace algorithm for efficient learning. We provide a detailed theoretical analysis of the new algorithm that shows its efficiency and noise-tolerance inherited from Retrace and advantage learning. Furthermore, our analysis shows that GRAPE's learning is significantly more efficient than that of a simple learning-rate-based approach while keeping the same level of noise-tolerance. We applied GRAPE to control problems and obtained experimental results supporting our theoretical analysis.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

arXiv.org Machine Learning

1906.07586

Country:

Asia > Japan > Kyūshū & Okinawa > Okinawa (0.05)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > New Hampshire > Hillsborough County > Nashua (0.04)
(4 more...)

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Artificial Intelligence Game Talk, University of Alberta, Hex and Chess

#artificialintelligence, Jun-15-2019, 18:09:14 GMT

U of Alberta created the first Computing Science department in Canada in 1964. It has a long tradition of research in AI (it is rated 3rd in the world in machine learning). It has also led in the development of AI for strategy games, among them Checkers, Chess, Go and Poker. The results can be commercialized in non-game applications as well. The evening's talks were by Jonathan Schaeffer (computer chess) and Ryan Hayward (the strategy game Hex).

alberta, artificial intelligence, university, (14 more...)

#artificialintelligence

Country:

North America > Canada > Ontario > Toronto (0.15)
North America > Canada > Alberta > Census Division No. 11 > Edmonton Metropolitan Region > Edmonton (0.05)
Europe > Denmark (0.05)

Industry: Leisure & Entertainment > Games > Chess (1.00)

Technology: Information Technology > Artificial Intelligence > Games > Chess (1.00)

Exponential-Binary State-Space Search

Sturtevant, Nathan, Helmert, Malte

arXiv.org Artificial Intelligence, Jun-7-2019

Iterative deepening search is used in applications where the best cost bound for state-space search is unknown. The iterative deepening process is used to avoid overshooting the appropriate cost bound and doing too much work as a result. However, iterative deepening search also does too much work if the cost bound grows too slowly. This paper proposes a new framework for iterative deepening search called exponential-binary state-space search. The approach interleaves exponential and binary searches to find the desired cost bound, reducing the worst-case overhead from polynomial to logarithmic. Exponential-binary search can be used with bounded depth-first search to improve the worst-case performance of IDA* and with breadth-first heuristic search to improve the worst-case performance of search with inconsistent heuristics.
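The idea of combining an exponential search with a binary search over a cost bound can be illustrated with a minimal sketch. This is a simplified illustration, not the paper's algorithm: it assumes integer bounds and a monotone predicate, and the names `find_min_bound` and `solved_within` are hypothetical. The published framework also accounts for the work done under each bound, which this sketch ignores.

```python
from typing import Callable


def find_min_bound(solved_within: Callable[[int], bool], start: int = 1) -> int:
    """Return the smallest integer bound >= start for which solved_within is True.

    Assumption (not from the source): solved_within is monotone, i.e. once it
    is True for some bound it stays True for every larger bound, and it is
    True for at least one bound.
    """
    if start < 1:
        raise ValueError("start must be a positive integer")
    if solved_within(start):
        return start

    # Exponential phase: double the bound until a bounded search succeeds.
    lo, hi = start, start * 2
    while not solved_within(hi):
        lo, hi = hi, hi * 2

    # Binary phase: lo is known to fail and hi is known to succeed.
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if solved_within(mid):
            hi = mid
        else:
            lo = mid
    return hi
```

In this toy setting the number of probes grows logarithmically with the target bound, whereas raising the bound one unit at a time would need a number of probes linear in it.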

algorithm, artificial intelligence, expansion, (15 more...)

arXiv.org Artificial Intelligence

1906.02912

Country:

North America > Canada > Alberta > Census Division No. 11 > Edmonton Metropolitan Region > Edmonton (0.04)
Europe > Switzerland > Basel-City > Basel (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)

Ease-of-Teaching and Language Structure from Emergent Communication

Li, Fushan, Bowling, Michael

arXiv.org Artificial Intelligence, Jun-5-2019

Artificial agents have been shown to learn to communicate when needed to complete a cooperative task. Some level of language structure (e.g., compositionality) has been found in the learned communication protocols. This observed structure is often the result of specific environmental pressures during training. By introducing new agents periodically to replace old ones, sequentially and within a population, we explore such a new pressure -- ease of teaching -- and show its impact on the structure of the resulting language.

listener, regime, topo 0, (14 more...)

arXiv.org Artificial Intelligence

1906.02403

Country:

North America > Canada > Alberta > Census Division No. 11 > Edmonton Metropolitan Region > Edmonton (0.14)
Europe > Denmark > Capital Region > Copenhagen (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Policy Based Inference in Trick-Taking Card Games

Rebstock, Douglas, Solinas, Christopher, Buro, Michael, Sturtevant, Nathan R.

arXiv.org Artificial Intelligence, May-26-2019

Trick-taking card games feature a large amount of private information that slowly gets revealed through a long sequence of actions. This makes the number of histories exponentially large in the action sequence length, as well as creating extremely large information sets. As a result, these games become too large to solve. To deal with these issues many algorithms employ inference, the estimation of the probability of states within an information set. In this paper, we demonstrate a Policy Based Inference (PI) algorithm that uses player modelling to infer the probability we are in a given state. We perform experiments in the German trick-taking card game Skat, in which we show that this method vastly improves the inference as compared to previous work, and increases the performance of the state-of-the-art Skat AI system Kermit when it is incorporated into its determinized search algorithm.
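One way to read "inference from a player model" is as a Bayesian update: each candidate state is weighted by how likely the modelled players were to take the observed actions in that state. The sketch below makes that assumption explicit; the abstract does not spell out the computation, and `policy_based_posterior` and `policy_prob` are hypothetical names chosen for illustration.

```python
from typing import Callable, Dict, Hashable, Optional, Sequence


def policy_based_posterior(
    states: Sequence[Hashable],
    history: Sequence[Hashable],
    policy_prob: Callable[[Hashable, Sequence[Hashable], Hashable], float],
    prior: Optional[Dict[Hashable, float]] = None,
) -> Dict[Hashable, float]:
    """Posterior over candidate states given the observed action history.

    Assumed model: P(state | history) is proportional to
    prior(state) * product over steps of policy_prob(state, prefix, action).
    """
    weights = {}
    for state in states:
        weight = 1.0 if prior is None else prior[state]
        for i, action in enumerate(history):
            # Probability that the modelled player picks this action in this
            # state after the actions seen so far.
            weight *= policy_prob(state, history[:i], action)
        weights[state] = weight

    total = sum(weights.values())
    if total == 0.0:
        # No candidate explains the history; fall back to a uniform estimate.
        return {state: 1.0 / len(states) for state in states}
    return {state: weight / total for state, weight in weights.items()}
```

The resulting distribution could then be used to weight or sample states in a determinized search, which is the use the abstract describes.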

artificial intelligence, inference, machine learning, (17 more...)

arXiv.org Artificial Intelligence

1905.10911

Country:

North America > Canada > Alberta > Census Division No. 11 > Edmonton Metropolitan Region > Edmonton (0.04)
Europe > Germany (0.04)
Asia > Malaysia (0.04)

Genre:

Research Report > Experimental Study (0.68)
Research Report > New Finding (0.46)

Industry: Leisure & Entertainment > Games > Bridge (0.69)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.88)

Learning Policies from Human Data for Skat

Rebstock, Douglas, Solinas, Christopher, Buro, Michael

arXiv.org Artificial Intelligence, May-26-2019

Decision-making in large imperfect information games is difficult. Thanks to recent success in Poker, Counterfactual Regret Minimization (CFR) methods have been at the forefront of research in these games. However, most of the success in large games comes with the use of a forward model and powerful state abstractions. In trick-taking card games like Bridge or Skat, large information sets and an inability to advance the simulation without fully determinizing the state make forward search problematic. Furthermore, state abstractions can be especially difficult to construct because the precise holdings of each player directly impact move values. In this paper we explore learning model-free policies for Skat from human game data using deep neural networks (DNN). We produce a new state-of-the-art system for bidding and game declaration by introducing methods to a) directly vary the aggressiveness of the bidder and b) declare games based on expected value while mitigating issues with rarely observed state-action pairs. Although cardplay policies learned through imitation are slightly weaker than the current best search-based method, they run orders of magnitude faster. We also explore how these policies could be learned directly from experience in a reinforcement learning setting and discuss the value of incorporating human data for this task.

artificial intelligence, kermit, machine learning, (18 more...)

arXiv.org Artificial Intelligence

1905.10907

Country:

North America > Canada > Alberta > Census Division No. 11 > Edmonton Metropolitan Region > Edmonton (0.04)
Europe > Germany (0.04)

Genre: Research Report (0.64)

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.69)

Multivariate Time Series Classification using Dilated Convolutional Neural Network

Yazdanbakhsh, Omolbanin, Dick, Scott

arXiv.org Machine Learning, May-5-2019

Multivariate time series classification is a high value and well-known problem in machine learning community. Feature extraction is a main step in classification tasks. Traditional approaches employ handcrafted features for classification while convolutional neural networks (CNN) are able to extract features automatically. In this paper, we use dilated convolutional neural network for multivariate time series classification.

A general approach for time series classification is splitting time series into equal-size segments using a fixed-length sliding window and extracting handcrafted features from the segments for classification tasks. The features are usually statistical measurements or features extracted from another domain, such as the Fourier and wavelet domains (Jiang & Yin, 2015; Ravi et al., 2017; Lin et al., 2003). In multivariate time series classification, commonly, information is extracted separately from each variate, and the features are concatenated for the classification task
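For readers unfamiliar with dilation, a minimal NumPy sketch of a single dilated 1-D convolution filter over a multivariate series follows. It only illustrates the operation in general; the function name `dilated_conv1d`, the array shapes, and the "valid" padding are assumptions, and nothing here reflects the paper's actual architecture.

```python
import numpy as np


def dilated_conv1d(x: np.ndarray, kernel: np.ndarray, dilation: int = 1) -> np.ndarray:
    """Apply one dilated filter across all variates of a time series.

    x:      array of shape (channels, time), one row per variate
    kernel: array of shape (channels, k), one row of taps per variate
    Returns an array of shape (time - (k - 1) * dilation,), with no padding.
    """
    channels, length = x.shape
    if kernel.shape[0] != channels:
        raise ValueError("kernel and input must have the same number of channels")
    k = kernel.shape[1]
    span = (k - 1) * dilation + 1  # receptive field of the filter
    out_len = length - span + 1
    if out_len <= 0:
        raise ValueError("input is shorter than the dilated receptive field")

    out = np.zeros(out_len)
    for j in range(k):
        # Tap j reads the input `j * dilation` steps ahead of the output index.
        window = x[:, j * dilation : j * dilation + out_len]
        out += (kernel[:, j : j + 1] * window).sum(axis=0)
    return out
```

With `dilation=1` this reduces to an ordinary cross-correlation; larger dilations widen the receptive field without adding taps.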

artificial intelligence, machine learning, time series, (13 more...)

arXiv.org Machine Learning

1905.01697

Country:

North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.05)
North America > United States > California > Los Angeles County > Long Beach (0.04)
North America > Canada > Alberta > Census Division No. 6 > Calgary Metropolitan Region > Calgary (0.04)
North America > Canada > Alberta > Census Division No. 11 > Edmonton Metropolitan Region > Edmonton (0.04)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

AAAI News

Hamilton, Carol (Association for the Advancement of Artificial Intelligence)

AI Magazine, Apr-4-2019

Submissions for HCOMP-19 Are Due in June! The Seventh AAAI Conference on Human Computation and Crowdsourcing (HCOMP 2019) will be held October 28-30 at Skamania Lodge in Washington State near the Columbia Gorge River, just 45 minutes from Portland, Oregon. This year is the 10-year anniversary of the very first HCOMP workshop in Paris, and to celebrate, there will be special events, talks, and panels throughout the conference. HCOMP is the premier venue for disseminating the latest research findings on crowdsourcing and human computation. While artificial intelligence (AI) and human-computer interaction (HCI) represent traditional mainstays of the conference, HCOMP believes strongly in inviting, fostering, and promoting broad, interdisciplinary research.

aaai, computer science, university, (13 more...)

AI Magazine

Country:

North America > United States > Washington (0.24)
North America > United States > Oregon > Multnomah County > Portland (0.24)
North America > United States > California > San Francisco County > San Francisco (0.14)
(25 more...)

Genre:

Personal > Honors (1.00)
Research Report (0.86)

Industry:

Government > Regional Government > North America Government > United States Government (1.00)
Government > Military (1.00)
Banking & Finance (0.93)
(2 more...)

Technology:

Information Technology > Human Computer Interaction (1.00)
Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
(4 more...)

On the Functional Equivalence of TSK Fuzzy Systems to Neural Networks, Mixture of Experts, CART, and Stacking Ensemble Regression

Wu, Dongrui, Lin, Chin-Teng, Huang, Jian, Zeng, Zhigang

arXiv.org Artificial Intelligence, Mar-25-2019

Fuzzy systems have achieved great success in numerous applications. However, there are still many challenges in designing an optimal fuzzy system, e.g., how to efficiently train its parameters, how to improve its performance without adding too many parameters, how to balance the trade-off between cooperations and competitions among the rules, how to overcome the curse of dimensionality, etc. Literature has shown that by making appropriate connections between fuzzy systems and other machine learning approaches, good practices from other domains may be used to improve the fuzzy systems, and vice versa. This paper gives an overview on the functional equivalence between Takagi-Sugeno-Kang fuzzy systems and four classic machine learning approaches -- neural networks, mixture of experts, classification and regression trees, and stacking ensemble regression -- for regression problems. We also point out some promising new research directions, inspired by the functional equivalence, that could lead to solutions to the aforementioned problems. To our knowledge, this is so far the most comprehensive overview on the connections between fuzzy systems and other popular machine learning approaches, and hopefully will stimulate more hybridization between different machine learning algorithms.
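To make the comparison with mixture of experts concrete, here is a minimal sketch of a first-order TSK fuzzy system for regression. It assumes Gaussian membership functions, a product t-norm, and linear rule consequents, which is a common textbook setup rather than anything specified in the abstract; `tsk_predict` and its parameter names are hypothetical.

```python
import numpy as np


def tsk_predict(
    x: np.ndarray,
    centers: np.ndarray,
    sigmas: np.ndarray,
    weights: np.ndarray,
    biases: np.ndarray,
) -> float:
    """Output of a first-order TSK system with R rules on a d-dimensional input.

    x: (d,), centers and sigmas: (R, d), weights: (R, d), biases: (R,)
    """
    # Firing level of each rule: product of Gaussian memberships, computed in
    # log space and shifted by the maximum for numerical stability.
    log_firing = -0.5 * (((x - centers) / sigmas) ** 2).sum(axis=1)
    firing = np.exp(log_firing - log_firing.max())

    # Normalized firing levels play the role of a gating distribution.
    gate = firing / firing.sum()

    # Each rule's consequent is a linear model, comparable to one expert.
    rule_outputs = weights @ x + biases
    return float(gate @ rule_outputs)
```

Read this way, the normalized firing levels act like a gating network and the rule consequents like experts, which is the kind of correspondence the overview discusses.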

artificial intelligence, fuzzy logic, machine learning, (14 more...)

arXiv.org Artificial Intelligence

1903.10572

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
South America > Brazil (0.04)
Oceania > Australia > Western Australia > Perth (0.04)
(16 more...)

Genre: Overview (1.00)

Industry: Education (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Fuzzy Logic (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Improving Search with Supervised Learning in Trick-Based Card Games

Solinas, Christopher, Rebstock, Douglas, Buro, Michael

arXiv.org Artificial Intelligence, Mar-22-2019

In trick-taking card games, a two-step process of state sampling and evaluation is widely used to approximate move values. While the evaluation component is vital, the accuracy of move value estimates is also fundamentally linked to how well the sampling distribution corresponds to the true distribution. Despite this, recent work in trick-taking card game AI has mainly focused on improving evaluation algorithms with limited work on improving sampling. In this paper, we focus on the effect of sampling on the strength of a player and propose a novel method of sampling more realistic states given move history. In particular, we use predictions about locations of individual cards made by a deep neural network -- trained on data from human gameplay -- in order to sample likely worlds for evaluation. This technique, used in conjunction with Perfect Information Monte Carlo (PIMC) search, provides a substantial increase in cardplay strength in the popular trick-taking card game of Skat.
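The two-step sample-then-evaluate loop behind PIMC search can be sketched generically. The callables `sample_world` and `evaluate` are hypothetical placeholders: in the setting of the abstract, the first would draw a full deal consistent with the move history (for example, guided by a learned card-location model) and the second would score a move with a perfect-information solver. Neither is implemented here.

```python
import random
from typing import Callable, Hashable, Optional, Sequence, TypeVar

World = TypeVar("World")


def pimc_choose_move(
    moves: Sequence[Hashable],
    sample_world: Callable[[random.Random], World],
    evaluate: Callable[[World, Hashable], float],
    num_worlds: int = 50,
    rng: Optional[random.Random] = None,
) -> Hashable:
    """Pick the move with the highest total value over sampled worlds."""
    if not moves:
        raise ValueError("at least one legal move is required")
    rng = rng or random.Random(0)

    totals = {move: 0.0 for move in moves}
    for _ in range(num_worlds):
        # Step 1: sample a fully specified state from the information set.
        world = sample_world(rng)
        # Step 2: evaluate every move as if that world were the true state.
        for move in moves:
            totals[move] += evaluate(world, move)
    return max(moves, key=lambda move: totals[move])
```

Because every move is scored on the same sampled worlds, the quality of `sample_world` directly shapes the value estimates, which is the dependence the abstract highlights.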

artificial intelligence, information, machine learning, (17 more...)

arXiv.org Artificial Intelligence

1903.09604

Country:

North America > United States > Texas (0.04)
North America > Canada > Alberta > Census Division No. 11 > Edmonton Metropolitan Region > Edmonton (0.04)
Europe > Germany (0.04)

Genre: Research Report (0.84)

Industry: Leisure & Entertainment > Games > Bridge (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.87)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.68)
