
Strongly Solving $7 \times 6$ Connect-Four on Consumer Grade Hardware

Böck, Markus

arXiv.org Artificial Intelligence

While the game Connect-Four has been solved mathematically and the best move can be computed effectively with search-based methods, a strong solution in the form of a look-up table was believed to be infeasible. In this paper, we revisit a symbolic search method based on binary decision diagrams to produce strong solutions. With our efficient implementation we were able to produce an 89.6 GB look-up table in 47 hours on a single CPU core with 128 GB of main memory for the standard $7 \times 6$ board size. In addition to this win-draw-loss evaluation, our open-source artifact includes an alpha-beta search to find the move which achieves the fastest win or slowest loss.
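The alpha-beta search mentioned in the abstract can be sketched in its compact negamax form. The toy tree below (leaves as integer scores, internal nodes as lists of subtrees) is a hypothetical stand-in, not the paper's implementation, which operates on Connect-Four positions:

```python
def negamax(node, alpha=float("-inf"), beta=float("inf")):
    """Alpha-beta search in negamax form over a toy game tree.
    Leaves are integer scores from the side-to-move's point of view;
    internal nodes are lists of child subtrees."""
    if isinstance(node, int):
        return node                        # leaf: static evaluation
    best = float("-inf")
    for child in node:
        score = -negamax(child, -beta, -alpha)
        best = max(best, score)
        alpha = max(alpha, score)
        if alpha >= beta:                  # cutoff: the opponent avoids this line
            break
    return best

tree = [[3, [5, -2]], [-4, [1, 7]], [0]]
```

Pruning never changes the root value, only the number of nodes visited; here `negamax(tree)` agrees with a full minimax search of the same tree.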


Estimating the number of reachable positions in Minishogi

Ishii, Sotaro, Tanaka, Tetsuro

arXiv.org Artificial Intelligence

To investigate the feasibility of strongly solving Minishogi (Gogo Shogi), it is necessary to know the number of positions reachable from the initial position. However, a significant gap currently remains between the lower and upper bounds on this value, since checking the legality of a Minishogi position is difficult. In this paper, the authors estimate the number of reachable positions by generating candidate positions with uniform random sampling and measuring the proportion of those reachable by a sequence of legal moves from the initial position. The experimental results indicate that the number of reachable Minishogi positions is approximately $2.38\times 10^{18}$.
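The proportion-based estimator described in the abstract is generic: sample candidates uniformly, measure the reachable fraction, and scale by the candidate count. The sketch below is a toy stand-in (the "reachability" predicate and all names are hypothetical, not the authors' code):

```python
import random

def estimate_count(total_candidates, is_reachable, n_samples, seed=0):
    """Estimate |reachable set| as total_candidates times the fraction of a
    uniform random sample for which is_reachable() holds."""
    rng = random.Random(seed)
    hits = sum(is_reachable(rng.randrange(total_candidates))
               for _ in range(n_samples))
    return total_candidates * hits / n_samples

# Toy stand-in: "reachable" = divisible by 3, candidate space [0, 10**6).
# The true count is ~333,334, so the estimate should land close to that.
est = estimate_count(10**6, lambda x: x % 3 == 0, 20_000)
```

The standard error of the estimated proportion shrinks as $1/\sqrt{n}$, which is why a moderate sample suffices even for a candidate space of $10^{18}$ positions, provided the legality check itself is reliable.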


Vision Encoder-Decoder Models for AI Coaching

Nayak, Jyothi S, Khan, Afifah Khan Mohammed Ajmal, Manjeshwar, Chirag, Banday, Imadh Ajaz

arXiv.org Artificial Intelligence

This research paper introduces an innovative AI coaching approach by integrating vision-encoder-decoder models. The feasibility of this method is demonstrated using a Vision Transformer as the encoder and GPT-2 as the decoder, achieving a seamless integration of visual input and textual interaction. Departing from conventional practices of employing distinct models for image recognition and text-based coaching, our integrated architecture directly processes input images, enabling natural question-and-answer dialogues with the AI coach. This unique strategy simplifies model architecture while enhancing the overall user experience in human-AI interactions. We showcase sample results to demonstrate the capability of the model. The results underscore the methodology's potential as a promising paradigm for creating efficient AI coach models in various domains involving visual inputs. Importantly, this potential holds true regardless of the particular visual encoder or text decoder chosen. Additionally, we conducted experiments with different sizes of GPT-2 to assess the impact on AI coach performance, providing valuable insights into the scalability and versatility of our proposed methodology.


Learning Temporally Extended Skills in Continuous Domains as Symbolic Actions for Planning

Achterhold, Jan, Krimmel, Markus, Stueckler, Joerg

arXiv.org Artificial Intelligence

Problems which require both long-horizon planning and continuous control capabilities pose significant challenges to existing reinforcement learning agents. In this paper we introduce a novel hierarchical reinforcement learning agent which links temporally extended skills for continuous control with a forward model in a symbolic discrete abstraction of the environment's state for planning. We term our agent SEADS for Symbolic Effect-Aware Diverse Skills. We formulate an objective and a corresponding algorithm which lead to unsupervised learning of a diverse set of skills through intrinsic motivation, given a known state abstraction. The skills are jointly learned with the symbolic forward model, which captures the effect of skill execution in the state abstraction. After training, we can leverage the skills as symbolic actions using the forward model for long-horizon planning and subsequently execute the plan using the learned continuous-action control skills. The proposed algorithm learns skills and forward models that can be used to solve complex tasks which require both continuous control and long-horizon planning capabilities with a high success rate. It compares favorably with other flat and hierarchical reinforcement learning baseline agents and is successfully demonstrated with a real robot.


Cable Routing and Assembly using Tactile-driven Motion Primitives

Wilson, Achu, Jiang, Helen, Lian, Wenzhao, Yuan, Wenzhen

arXiv.org Artificial Intelligence

Manipulating cables is challenging for robots because of the infinite degrees of freedom of the cables and frequent occlusion by the gripper and the environment. These challenges are further complicated by the dexterous nature of the operations required for cable routing and assembly, such as weaving and inserting, hampering common solutions with vision-only sensing. In this paper, we propose to integrate tactile-guided low-level motion control with high-level vision-based task parsing for a challenging task: cable routing and assembly on a reconfigurable task board. Specifically, we build a library of tactile-guided motion primitives using a fingertip GelSight sensor, where each primitive reliably accomplishes an operation such as cable following and weaving. The overall task is inferred via visual perception given a goal configuration image, and the inferred task is then used to generate the primitive sequence. Experiments demonstrate the effectiveness of the individual tactile-guided primitives and of the integrated end-to-end solution, which significantly outperforms the method without tactile sensing. Our reconfigurable task setup and proposed baselines provide a benchmark for future research in cable manipulation. More details and video are presented at \url{https://helennn.github.io/cable-manip/}


AlphaZero, a novel Reinforcement Learning Algorithm, deployed in JavaScript

#artificialintelligence

In this blog post, you will learn about and implement AlphaZero, an exciting and novel reinforcement learning algorithm used to beat world champions in games like Go and chess. You will use it to master a pencil-and-paper game (Dots and Boxes) and deploy it in a web app, entirely in JavaScript. AlphaZero's key and most exciting property is its ability to achieve superhuman play in board games without relying on external knowledge. AlphaZero learns to master the game by playing against itself (self-play) and learning from those experiences. We will leverage a "simplified, highly flexible, commented, and easy to understand" Python implementation of AlphaZero by Surag Nair, available on GitHub. You can go ahead and play the game here. The web app and JavaScript implementation are available here; this code was ported from the Python implementation.
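The self-play idea in the post boils down to playing games against yourself and labeling every visited state with the final outcome from that state's player's perspective. The sketch below illustrates just that bookkeeping on a toy take-1-or-2 Nim game with a uniform-random policy standing in for AlphaZero's MCTS-guided moves; all names are hypothetical and this is not Surag Nair's implementation:

```python
import random

def self_play_episode(pile=7, seed=0):
    """Play one game of toy Nim (take 1 or 2 stones; taking the last stone
    wins) against yourself with random moves, then label each visited state
    with the outcome z in {+1, -1} from that state's player's perspective."""
    rng = random.Random(seed)
    history, player = [], 1
    while pile > 0:
        history.append((pile, player))
        take = rng.choice([1, 2]) if pile >= 2 else 1
        pile -= take
        player = -player                  # alternate sides
    winner = -player                      # the player who took the last stone
    return [(state, p, 1 if p == winner else -1) for state, p in history]
```

In the full algorithm these `(state, outcome)` pairs (plus the MCTS visit counts as a policy target) are exactly what the neural network is trained on between rounds of self-play.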


The Minimax Algorithm in Artificial Intelligence

#artificialintelligence

Algorithms can search game trees to determine the best move to make from the current state. The best known is the minimax algorithm, a useful method for simple two-player games. It selects the best move in an alternating game where each player opposes the other, working toward mutually exclusive goals. Each player knows the moves that are possible in the current game state, so for each move, all subsequent moves can be discovered.
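The idea above can be made concrete with minimax on tic-tac-toe, a standard example of an alternating game with mutually exclusive goals. This is a minimal sketch (board as a flat list of nine cells, X maximizing, O minimizing), not production engine code:

```python
def winner(b):
    """Return 'X' or 'O' if that player has three in a row, else None."""
    lines = [(0, 1, 2), (3, 4, 5), (6, 7, 8), (0, 3, 6),
             (1, 4, 7), (2, 5, 8), (0, 4, 8), (2, 4, 6)]
    for i, j, k in lines:
        if b[i] != " " and b[i] == b[j] == b[k]:
            return b[i]
    return None

def minimax(b, player):
    """Exhaustive minimax: +1 if X wins, -1 if O wins, 0 for a draw,
    assuming both sides play perfectly from board b with `player` to move."""
    w = winner(b)
    if w == "X":
        return 1
    if w == "O":
        return -1
    moves = [i for i, c in enumerate(b) if c == " "]
    if not moves:
        return 0                          # full board, no winner: draw
    scores = []
    for m in moves:
        b[m] = player                     # try the move
        scores.append(minimax(b, "O" if player == "X" else "X"))
        b[m] = " "                        # undo it
    return max(scores) if player == "X" else min(scores)
```

Because both players are assumed to play perfectly, the empty board evaluates to 0: tic-tac-toe is a draw under optimal play.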


HEX and Neurodynamic Programming

Banerjee, Debangshu

arXiv.org Artificial Intelligence

Hex is a complex game with a high branching factor. This is the first attempt to solve Hex without game tree structures and the associated pruning methods. We also abstain from any heuristic information about virtual connections or semi-virtual connections, which were used in all previously known computer versions of the game. The H-search algorithm, which was the basis for finding such connections and had been used with success in previous Hex-playing agents, has been forgone. Instead, we use reinforcement learning through self-play and approximation through neural networks to bypass the problems of the high branching factor and of maintaining large tables for state-action evaluations. Our code is based primarily on NeuroHex. The inspiration is drawn from the recent success of AlphaGo Zero.


Playing Chess with Limited Look Ahead

Maesumi, Arman

arXiv.org Artificial Intelligence

We have seen numerous machine learning methods tackle the game of chess over the years. However, one common element in these works is the need for a finely optimized look-ahead algorithm. The particular interest of this research lies in creating a chess engine that is highly capable but restricted in its look-ahead depth. We train a deep neural network to serve as a static evaluation function, accompanied by a relatively simple look-ahead algorithm. We show that our static evaluation function has encoded some semblance of look-ahead knowledge and is comparable to classical evaluation functions. The strength of our chess engine is assessed by comparing its proposed moves against those proposed by Stockfish. We show that, despite strict restrictions on look-ahead depth, our engine recommends moves of equal strength in roughly $83\%$ of our sample positions.


Reinforcement Learning Tic Tac Toe Python Implementation

#artificialintelligence

Reinforcement learning is a machine learning paradigm in which agents learn to take the best decisions in order to maximize a reward. It is a very popular family of machine learning algorithms, in part because some view it as a way to build algorithms that act as closely as possible to human beings: choosing an action at every step so that you get the highest reward possible. While the other article explored the technical aspects of reinforcement learning, this time we will focus on the more practical aspects of the task. So let's jump right into the code. We will need to install only two dependencies for this one.
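The core loop such an agent runs, choosing an action, observing a reward, and updating a value table, can be shown on something even smaller than tic-tac-toe. The sketch below is tabular Q-learning on a hypothetical 5-state corridor (all names and the environment are stand-ins, not the article's code):

```python
import random
from collections import defaultdict

def train(episodes=500, alpha=0.5, gamma=0.9, eps=0.1, seed=0):
    """Tabular Q-learning on a corridor of states 0..4 with actions -1/+1.
    Reaching state 4 yields reward 1 and ends the episode."""
    rng = random.Random(seed)
    Q = defaultdict(float)                    # (state, action) -> value
    for _ in range(episodes):
        s = 0
        while s != 4:
            # Epsilon-greedy action choice, breaking ties randomly.
            if rng.random() < eps or Q[(s, -1)] == Q[(s, 1)]:
                a = rng.choice([-1, 1])
            else:
                a = 1 if Q[(s, 1)] > Q[(s, -1)] else -1
            s2 = min(max(s + a, 0), 4)        # walls at both ends
            r = 1.0 if s2 == 4 else 0.0
            # One-step Q-learning update toward r + gamma * max_a' Q(s2, a')
            Q[(s, a)] += alpha * (r + gamma * max(Q[(s2, -1)], Q[(s2, 1)])
                                  - Q[(s, a)])
            s = s2
    return Q
```

After training, the learned values rank "move right" above "move left" in every state, which is exactly the greedy policy a tic-tac-toe agent extracts from its own (much larger) value table.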