AITopics

1904.07366

Genre: Research Report (0.50)

Industry: Information Technology (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (1.00)

#artificialintelligenceApr-14-2019, 10:53:11 GMT

Create a Connect Four AI using Python

Learn how to create an expert level artificial intelligence to play Connect Four using Python. The concepts you learn in this tutorial from Keith Galli can apply to creating AIs for other games as well. The tutorial starts out with a very simple implementation, then progresses to choosing a column based on score. Finally, the minimax algorithm is implemented with alpha beta pruning. You can watch the full video course on the freeCodeCamp.org

artificial intelligence, python

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.78)

Junyent, Miquel, Jonsson, Anders, Gómez, Vicenç

Deep Policies for Width-Based Planning in Pixel Domains

arXiv.org Artificial IntelligenceApr-12-2019

Width-based planning has demonstrated great success in recent years due to its ability to scale independently of the size of the state space. For example, Bandres et al. (2018) introduced a rollout version of the Iterated Width algorithm whose performance compares well with humans and learning methods in the pixel setting of the Atari games suite. In this setting, planning is done on-line using the "screen" states and selecting actions by looking ahead into the future. However, this algorithm is purely exploratory and does not leverage past reward information. Furthermore, it requires the state to be factored into features that need to be pre-defined for the particular task, e.g., the B-PROST pixel features. In this work, we extend width-based planning by incorporating an explicit policy in the action selection mechanism. Our method, called $\pi$-IW, interleaves width-based planning and policy learning using the state-actions visited by the planner. The policy estimate takes the form of a neural network and is in turn used to guide the planning step, thus reinforcing promising paths. Surprisingly, we observe that the representation learned by the neural network can be used as a feature space for the width-based planner without degrading its performance, thus removing the requirement of pre-defined features for the planner. We compare $\pi$-IW with previous width-based methods and with AlphaZero, a method that also interleaves planning and learning, in simple environments, and show that $\pi$-IW has superior performance. We also show that $\pi$-IW algorithm outperforms previous width-based methods in the pixel setting of Atari games suite.

artificial intelligence, machine learning, reinforcement learning, (19 more...)

1904.07091

Country: North America > United States (0.28)

Genre: Research Report > New Finding (0.46)

Industry: Leisure & Entertainment > Games > Computer Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.68)

arXiv.org Artificial IntelligenceApr-10-2019

ReinBo: Machine Learning pipeline search and configuration with Bayesian Optimization embedded Reinforcement Learning

Sun, Xudong, Lin, Jiali, Bischl, Bernd

Machine learning pipeline potentially consists of several stages of operations like data preprocessing, feature engineering and machine learning model training. Each operation has a set of hyper-parameters, which can become irrelevant for the pipeline when the operation is not selected. This gives rise to a hierarchical conditional hyper-parameter space. To optimize this mixed continuous and discrete conditional hierarchical hyper-parameter space, we propose an efficient pipeline search and configuration algorithm which combines the power of Reinforcement Learning and Bayesian Optimization. Empirical results show that our method performs favorably compared to state of the art methods like Auto-sklearn , TPOT, Tree Parzen Window, and Random Search.

artificial intelligence, machine learning, reinforcement learning, (15 more...)

1904.05381

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.95)
(2 more...)

arXiv.org Machine LearningApr-8-2019

ASAP: Architecture Search, Anneal and Prune

Noy, Asaf, Nayman, Niv, Ridnik, Tal, Zamir, Nadav, Doveh, Sivan, Friedman, Itamar, Giryes, Raja, Zelnik-Manor, Lihi

Automatic methods for Neural Architecture Search (NAS) have been shown to produce state-of-the-art network models, yet, their main drawback is the computational complexity of the search process. As some primal methods optimized over a discrete search space, thousands of days of GPU were required for convergence. A recent approach is based on constructing a differentiable search space that enables gradient-based optimization, thus reducing the search time to a few days. While successful, such methods still include some incontinuous steps, e.g., the pruning of many weak connections at once. In this paper, we propose a differentiable search space that allows the annealing of architecture weights, while gradually pruning inferior operations, thus the search converges to a single output network in a continuous manner. Experiments on several vision datasets demonstrate the effectiveness of our method with respect to the search cost, accuracy and the memory footprint of the achieved model.

artificial intelligence, machine learning, opération, (16 more...)

1904.04123

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (0.97)
Information Technology > Artificial Intelligence > Vision (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.76)

Anthony, Thomas, Nishihara, Robert, Moritz, Philipp, Salimans, Tim, Schulman, John

Policy Gradient Search: Online Planning and Expert Iteration without Search Trees

arXiv.org Machine LearningApr-7-2019

Monte Carlo Tree Search (MCTS) algorithms perform simulation-based search to improve policies online. During search, the simulation policy is adapted to explore the most promising lines of play. MCTS has been used by state-of-the-art programs for many problems, however a disadvantage to MCTS is that it estimates the values of states with Monte Carlo averages, stored in a search tree; this does not scale to games with very high branching factors. We propose an alternative simulation-based search method, Policy Gradient Search (PGS), which adapts a neural network simulation policy online via policy gradient updates, avoiding the need for a search tree. In Hex, PGS achieves comparable performance to MCTS, and an agent trained using Expert Iteration with PGS was able defeat MoHex 2.0, the strongest open-source Hex agent, in 9x9 Hex.

algorithm, artificial intelligence, machine learning, (16 more...)

1904.03646

Country:

North America > Canada > Alberta (0.14)
North America > United States > California > Alameda County > Berkeley (0.04)
North America > Canada > Quebec > Montreal (0.04)
(2 more...)

Genre: Research Report (0.82)

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.48)

arXiv.org Artificial IntelligenceApr-6-2019

Monte Carlo Neural Fictitious Self-Play: Approach to Approximate Nash equilibrium of Imperfect-Information Games

Zhang, Li, Wang, Wei, Li, Shijian, Pan, Gang

Researchers on artificial intelligence have achieved human-level intelligence in large-scale perfect-information games, but it is still a challenge to achieve (nearly) optimal results (in other words, an approximate Nash Equilibrium) in large-scale imperfect-information games (i.e. war games, football coach or business strategies). Neural Fictitious Self Play (NFSP) is an effective algorithm for learning approximate Nash equilibrium of imperfect-information games from self-play without prior domain knowledge. However, it relies on Deep Q-Network, which is off-line and is hard to converge in online games with changing opponent strategy, so it can't approach approximate Nash equilibrium in games with large search scale and deep search depth. In this paper, we propose Monte Carlo Neural Fictitious Self Play (MC-NFSP), an algorithm combines Monte Carlo tree search with NFSP, which greatly improves the performance on large-scale zero-sum imperfect-information games. Experimentally, we demonstrate that the proposed Monte Carlo Neural Fictitious Self Play can converge to approximate Nash equilibrium in games with large-scale search depth while the Neural Fictitious Self Play can't. Furthermore, we develop Asynchronous Neural Fictitious Self Play (ANFSP). It use asynchronous and parallel architecture to collect game experience. In experiments, we show that parallel actor-learners have a further accelerated and stabilizing effect on training.

artificial intelligence, machine learning, reinforcement learning, (12 more...)

1903.09569

Country: North America > United States (0.28)

Genre: Research Report (0.40)

Industry: Leisure & Entertainment > Games > Computer Games (0.67)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.89)
(2 more...)

Liu, Yunlong, Zheng, Jianyang

Combining Offline Models and Online Monte-Carlo Tree Search for Planning from Scratch

arXiv.org Artificial IntelligenceApr-5-2019

Planning in stochastic and partially observable environments is a central issue in artificial intelligence. One commonly used technique for solving such a problem is by constructing an accurate model firstly. Although some recent approaches have been proposed for learning optimal behaviour under model uncertainty, prior knowledge about the environment is still needed to guarantee the performance of the proposed algorithms. With the benefits of the Predictive State Representations (PSRs) approach for state representation and model prediction, in this paper, we introduce an approach for planning from scratch, where an offline PSR model is firstly learned and then combined with online Monte-Carlo tree search for planning with model uncertainty. By comparing with the state-of-the-art approach of planning with model uncertainty, we demonstrated the effectiveness of the proposed approaches along with the proof of their convergence. The effectiveness and scalability of our proposed approach are also tested on the RockSample problem, which are infeasible for the state-of-the-art BA-POMDP based approaches.

artificial intelligence, machine learning, psr model, (17 more...)

1904.03008

Country:

Asia > China > Fujian Province > Xiamen (0.04)
Oceania > Australia > New South Wales > Sydney (0.04)
North America > United States > Virginia > Arlington County > Arlington (0.04)
North America > United States > Massachusetts (0.04)

Genre: Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Hütter, Jan-Christian, Mao, Cheng, Rigollet, Philippe, Robeva, Elina

Estimation of Monge Matrices

arXiv.org Machine LearningApr-5-2019

Monge matrices and their permuted versions known as pre-Monge matrices naturally appear in many domains across science and engineering. While the rich structural properties of such matrices have long been leveraged for algorithmic purposes, little is known about their impact on statistical estimation. In this work, we propose to view this structure as a shape constraint and study the problem of estimating a Monge matrix subject to additive random noise. More specifically, we establish the minimax rates of estimation of Monge and pre-Monge matrices. In the case of pre-Monge matrices, the minimax-optimal least-squares estimator is not efficiently computable, and we propose two efficient estimators and establish their rates of convergence. Our theoretical findings are supported by numerical experiments.

artificial intelligence, machine learning, matrix, (17 more...)

1904.03136

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.54)

arXiv.org Machine LearningApr-5-2019

Single-Path NAS: Designing Hardware-Efficient ConvNets in less than 4 Hours

Stamoulis, Dimitrios, Ding, Ruizhou, Wang, Di, Lymberopoulos, Dimitrios, Priyantha, Bodhi, Liu, Jie, Marculescu, Diana

Can we automatically design a Convolutional Network (ConvNet) with the highest image classification accuracy under the runtime constraint of a mobile device? Neural architecture search (NAS) has revolutionized the design of hardware-efficient ConvNets by automating this process. However, the NAS problem remains challenging due to the combinatorially large design space, causing a significant searching time (at least 200 GPU-hours). To alleviate this complexity, we propose Single-Path NAS, a novel differentiable NAS method for designing hardware-efficient ConvNets in less than 4 hours. Our contributions are as follows: 1. Single-path search space: Compared to previous differentiable NAS methods, Single-Path NAS uses one single-path over-parameterized ConvNet to encode all architectural decisions with shared convolutional kernel parameters, hence drastically decreasing the number of trainable parameters and the search cost down to few epochs. 2. Hardware-efficient ImageNet classification: Single-Path NAS achieves 74.96% top-1 accuracy on ImageNet with 79ms latency on a Pixel 1 phone, which is state-of-the-art accuracy compared to NAS methods with similar constraints (<80ms). 3. NAS efficiency: Single-Path NAS search cost is only 8 epochs (30 TPU-hours), which is up to 5,000x faster compared to prior work. 4. Reproducibility: Unlike all recent mobile-efficient NAS methods which only release pretrained models, we open-source our entire codebase at: https://github.com/dstamoulis/single-path-nas.

artificial intelligence, convnet, machine learning, (16 more...)

1904.02877

Country: North America > United States (0.68)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)