Goto

Collaborating Authors

 Search


Neural Networks for Chess

arXiv.org Artificial Intelligence

AlphaZero, Leela Chess Zero and Stockfish NNUE revolutionized Computer Chess. This book gives a complete introduction into the technical inner workings of such engines. The book is split into four main chapters -- excluding chapter 1 (introduction) and chapter 6 (conclusion): Chapter 2 introduces neural networks and covers all the basic building blocks that are used to build deep networks such as those used by AlphaZero. Contents include the perceptron, back-propagation and gradient descent, classification, regression, multilayer perceptron, vectorization techniques, convolutional networks, squeeze and excitation networks, fully connected networks, batch normalization and rectified linear units, residual layers, overfitting and underfitting. Chapter 3 introduces classical search techniques used for chess engines as well as those used by AlphaZero. Contents include minimax, alpha-beta search, and Monte Carlo tree search. Chapter 4 shows how modern chess engines are designed. Aside from the ground-breaking AlphaGo, AlphaGo Zero and AlphaZero we cover Leela Chess Zero, Fat Fritz, Fat Fritz 2 and Efficiently Updatable Neural Networks (NNUE) as well as Maia. Chapter 5 is about implementing a miniaturized AlphaZero. Hexapawn, a minimalistic version of chess, is used as an example for that. Hexapawn is solved by minimax search and training positions for supervised learning are generated. Then as a comparison, an AlphaZero-like training loop is implemented where training is done via self-play combined with reinforcement learning. Finally, AlphaZero-like training and supervised training are compared.


Spatio-Temporal Attack Course-of-Action (COA) Search Learning for Scalable and Time-Varying Networks

arXiv.org Artificial Intelligence

One of the key topics in network security research is the autonomous COA (Couse-of-Action) attack search method. Traditional COA attack search methods that passively search for attacks can be difficult, especially as the network gets bigger. To address these issues, new autonomous COA techniques are being developed, and among them, an intelligent spatial algorithm is designed in this paper for efficient operations in scalable networks. On top of the spatial search, a Monte-Carlo (MC)- based temporal approach is additionally considered for taking care of time-varying network behaviors. Therefore, we propose a spatio-temporal attack COA search algorithm for scalable and time-varying networks.


Reversible Action Design for Combinatorial Optimization with Reinforcement Learning

arXiv.org Artificial Intelligence

Combinatorial optimization problem (COP) over graphs is a fundamental challenge in optimization. Reinforcement learning (RL) has recently emerged as a new framework to tackle these problems and has demonstrated promising results. However, most RL solutions employ a greedy manner to construct the solution incrementally, thus inevitably pose unnecessary dependency on action sequences and need a lot of problem-specific designs. We propose a general RL framework that not only exhibits state-of-the-art empirical performance but also generalizes to a variety class of COPs. Specifically, we define state as a solution to a problem instance and action as a perturbation to this solution. We utilize graph neural networks (GNN) to extract latent representations for given problem instances for state-action encoding, and then apply deep Q-learning to obtain a policy that gradually refines the solution by flipping or swapping vertex labels. Experiments are conducted on Maximum $k$-Cut and Traveling Salesman Problem and performance improvement is achieved against a set of learning-based and heuristic baselines.


Overview of Machine Learning

#artificialintelligence

In layman's terms, machine learning is to allow computers to learn automatically from data to obtain certain knowledge. As a discipline, machine learning usually refers to a type of problem and the method to solve this type of problem, that is, how to find the law from the observation data, and use the learned law to predict the unknown or unobservable data. In the early engineering field, machine learning is often called pattern recognition, but pattern recognition is more biased towards specific application tasks, such as optical character recognition, speech recognition, and face recognition. The characteristic of these tasks is that for us humans, these tasks are easy to complete, but we do not know how we do it, so it is difficult to manually design a computer program to complete these tasks. A feasible method is to design an algorithm that allows the computer to learn the rules from the labeled samples and use it to complete various recognition tasks. With the increasing application of machine learning technology, the concept of machine learning is now gradually replacing pattern recognition, becoming the general term for this type of problem and its solutions. Taking handwritten digit recognition as an example, we need to allow the computer to automatically recognize handwritten digits. Handwritten digit recognition is a classic machine learning task, which is simple for humans, but very difficult for computers. It is difficult for us to summarize the handwriting characteristics of each digit, or the rules for distinguishing different digits, so designing a set of recognition algorithms is an almost impossible task. In real life, many problems are similar to those of handwritten number recognition, such as object recognition and speech recognition. For this kind of problem, we don't know how to design a computer program to solve it. Even if it can be realized by some heuristic rules, the process is extremely complicated. Therefore, people began to try another way of thinking, that is, let the computer see a large number of samples, and learn some experience from them, and then use these experiences to identify new samples. To recognize handwritten digits, first manually annotate a large number of handwritten digital images (that is, each image is manually marked with what number it is), these images are used as training data, and then a set of models are automatically generated through the learning algorithm, and rely on it. This method of learning through data is called the method of machine learning. First, we use a life example to introduce some basic concepts in machine learning: samples, features, labels, models, learning algorithms, etc. Suppose we want to buy mangoes in the market, but we have no previous experience in selecting mangoes, how can we obtain this knowledge through learning? First, we randomly select some mangoes from the market and list the characteristics of each mango.


Neural Approaches to Co-Optimization in Robotics

arXiv.org Artificial Intelligence

Robots and intelligent systems that sense or interact with the world are increasingly being used to automate a wide array of tasks. The ability of these systems to complete these tasks depends on a large range of technologies such as the mechanical and electrical parts that make up the physical body of the robot and its sensors, perception algorithms to perceive the environment, and planning and control algorithms to produce meaningful actions. Therefore, it is often necessary to consider the interactions between these components when designing an embodied system. This thesis explores work on the task-driven co-optimization of robotics systems in an end-to-end manner, simultaneously optimizing the physical components of the system with inference or control algorithms directly for task performance. We start by considering the problem of optimizing a beacon-based localization system directly for localization accuracy. Designing such a system involves placing beacons throughout the environment and inferring location from sensor readings. In our work, we develop a deep learning approach to optimize both beacon placement and location inference directly for localization accuracy. We then turn our attention to the related problem of task-driven optimization of robots and their controllers. In our work, we start by proposing a data-efficient algorithm based on multi-task reinforcement learning. Our approach efficiently optimizes both physical design and control parameters directly for task performance by leveraging a design-conditioned controller capable of generalizing over the space of physical designs. We then follow this up with an extension to allow for the optimization over discrete morphological parameters such as the number and configuration of limbs. Finally, we conclude by exploring the fabrication and deployment of optimized soft robots.


Controlling Perceived Emotion in Symbolic Music Generation with Monte Carlo Tree Search

arXiv.org Artificial Intelligence

This paper presents a new approach for controlling emotion in symbolic music generation with Monte Carlo Tree Search. We use Monte Carlo Tree Search as a decoding mechanism to steer the probability distribution learned by a language model towards a given emotion. At every step of the decoding process, we use Predictor Upper Confidence for Trees (PUCT) to search for sequences that maximize the average values of emotion and quality as given by an emotion classifier and a discriminator, respectively. We use a language model as PUCT's policy and a combination of the emotion classifier and the discriminator as its value function. To decode the next token in a piece of music, we sample from the distribution of node visits created during the search. We evaluate the quality of the generated samples with respect to human-composed pieces using a set of objective metrics computed directly from the generated samples. We also perform a user study to evaluate how human subjects perceive the generated samples' quality and emotion. We compare PUCT against Stochastic Bi-Objective Beam Search (SBBS) and Conditional Sampling (CS). Results suggest that PUCT outperforms SBBS and CS in almost all metrics of music quality and emotion.


Task planning in robotics

Robohub

Suppose we have a robot in a simple world like the one below. Let's consider commanding our robot to perform a task such as "take the apple from the shelf and put it on the table". Simple task planning example world: A robot can move between a finite set of locations, and can pick and place objects at those locations. I would argue we humans have pretty good intuition for how a robot could achieve this task. We could describe what the robot should do by breaking the solution down into individual actions.


Generating Intermediate Steps for NLI with Next-Step Supervision

arXiv.org Artificial Intelligence

The Natural Language Inference (NLI) task often requires reasoning over multiple steps to reach the conclusion. While the necessity of generating such intermediate steps (instead of a summary explanation) has gained popular support, it is unclear how to generate such steps without complete end-to-end supervision and how such generated steps can be further utilized. In this work, we train a sequence-to-sequence model to generate only the next step given an NLI premise and hypothesis pair (and previous steps); then enhance it with external knowledge and symbolic search to generate intermediate steps with only next-step supervision. We show the correctness of such generated steps through automated and human verification. Furthermore, we show that such generated steps can help improve end-to-end NLI task performance using simple data augmentation strategies, across multiple public NLI datasets.


Beyond Greedy Search: Tracking by Multi-Agent Reinforcement Learning-based Beam Search

arXiv.org Artificial Intelligence

To track the target in a video, current visual trackers usually adopt greedy search for target object localization in each frame, that is, the candidate region with the maximum response score will be selected as the tracking result of each frame. However, we found that this may be not an optimal choice, especially when encountering challenging tracking scenarios such as heavy occlusion and fast motion. To address this issue, we propose to maintain multiple tracking trajectories and apply beam search strategy for visual tracking, so that the trajectory with fewer accumulated errors can be identified. Accordingly, this paper introduces a novel multi-agent reinforcement learning based beam search tracking strategy, termed BeamTracking. It is mainly inspired by the image captioning task, which takes an image as input and generates diverse descriptions using beam search algorithm. Accordingly, we formulate the tracking as a sample selection problem fulfilled by multiple parallel decision-making processes, each of which aims at picking out one sample as their tracking result in each frame. Each maintained trajectory is associated with an agent to perform the decision-making and determine what actions should be taken to update related information. When all the frames are processed, we select the trajectory with the maximum accumulated score as the tracking result. Extensive experiments on seven popular tracking benchmark datasets validated the effectiveness of the proposed algorithm.


CryoAI: Amortized Inference of Poses for Ab Initio Reconstruction of 3D Molecular Volumes from Real Cryo-EM Images

arXiv.org Artificial Intelligence

Cryo-electron microscopy (cryo-EM) has become a tool of fundamental importance in structural biology, helping us understand the basic building blocks of life. The algorithmic challenge of cryo-EM is to jointly estimate the unknown 3D poses and the 3D electron scattering potential of a biomolecule from millions of extremely noisy 2D images. Existing reconstruction algorithms, however, cannot easily keep pace with the rapidly growing size of cryo-EM datasets due to their high computational and memory cost. We introduce cryoAI, an ab initio reconstruction algorithm for homogeneous conformations that uses direct gradient-based optimization of particle poses and the electron scattering potential from single-particle cryo-EM data. CryoAI combines a learned encoder that predicts the poses of each particle image with a physics-based decoder to aggregate each particle image into an implicit representation of the scattering potential volume. This volume is stored in the Fourier domain for computational efficiency and leverages a modern coordinate network architecture for memory efficiency. Combined with a symmetrized loss function, this framework achieves results of a quality on par with state-of-the-art cryo-EM solvers for both simulated and experimental data, one order of magnitude faster for large datasets and with significantly lower memory requirements than existing methods.