Statistical mechanical analysis of sparse linear regression as a variable selection problem

arXiv.org Machine Learning

An algorithmic limit of compressed sensing or related variable-selection problems is analytically evaluated when a design matrix is given by an overcomplete random matrix. The replica method from statistical mechanics is employed to derive the result. The analysis is conducted through evaluation of the entropy, an exponential rate of the number of combinations of variables giving a specific value of fit error to given data, which are assumed to be generated from a linear process using the design matrix. This yields the typical achievable limit of the fit error when solving a representative $\ell_0$ problem and includes the presence of unfavourable phase transitions preventing local search algorithms from reaching the minimum-error configuration. The associated phase diagrams are presented. A noteworthy outcome of the phase diagrams is, however, that there exists a wide parameter region where no phase transition occurs from the high temperature down to the lowest temperature, at which the minimum-error configuration or the ground state is reached. This implies that certain local search algorithms can find the ground state with moderate computational costs in that region. The theoretical evaluation of the entropy is confirmed by extensive numerical experiments using the exchange Monte Carlo and multi-histogram methods. Another numerical test, based on a metaheuristic optimisation algorithm called simulated annealing, supports the theoretical predictions on local search algorithms: the ground state can be found with high probability in polynomial time with respect to system size.


The Actor Search Tree Critic (ASTC) for Off-Policy POMDP Learning in Medical Decision Making

arXiv.org Artificial Intelligence

Off-policy reinforcement learning enables learning a near-optimal policy from suboptimal experience, thereby opening opportunities for artificial intelligence applications in healthcare. Previous works have mainly framed patient-clinician interactions as Markov decision processes, although true physiological states are not necessarily fully observable from clinical data. We capture this situation with a partially observable Markov decision process, in which an agent optimises its actions over a belief represented as a distribution of patient states inferred from individual history trajectories. A Gaussian mixture model is fitted to the observed data. Moreover, we take into account the fact that small differences in pharmaceutical dosage could presumably result in significantly different effects by modelling a continuous policy through a Gaussian approximator directly in the policy space, i.e. the actor. To address the challenge of the infinite number of possible belief states, which renders exact value iteration intractable, we evaluate and plan only for every encountered belief, through a heuristic search tree that tightly maintains lower and upper bounds on the true value of the belief. We further resort to function approximation to update the value bound estimates, i.e. the critic, so that the tree search can be improved through tighter bounds at the fringe nodes, which are back-propagated to the root. Both actor and critic parameters are learned via gradient-based approaches. Our proposed policy, trained on real intensive care unit data, is capable of prescribing doses of vasopressors and intravenous fluids for sepsis patients that lead to the best patient outcomes.


Safe learning-based optimal motion planning for automated driving

arXiv.org Machine Learning

This paper presents preliminary work on learning the search heuristic for optimal motion planning for automated driving in urban traffic. Previous work considered a search-based optimal motion planning (SBOMP) framework that utilized numerical or model-based heuristics that did not consider dynamic obstacles. The optimal solution was still guaranteed since dynamic obstacles can only increase the cost. However, significant variations in search efficiency are observed depending on whether dynamic obstacles are present or not. This paper introduces a machine learning (ML) based heuristic that takes dynamic obstacles into account, thus improving performance consistency towards real-time implementation.
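The optimality argument in this abstract (a heuristic that ignores obstacles never overestimates the true cost, so the search still returns an optimal solution) can be illustrated with a minimal A* sketch. This is not the SBOMP framework from the paper: the 4-connected grid, the unit step cost, and the names `astar` and `manhattan` are assumptions chosen purely for illustration, with static blocked cells standing in for obstacles.

```python
import heapq

def manhattan(a, b):
    """Obstacle-free distance on a 4-connected grid; never overestimates."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def astar(start, goal, width, height, blocked, heuristic=manhattan):
    """Return the cost of a cheapest path from start to goal, or None."""
    frontier = [(heuristic(start, goal), 0, start)]  # (f = g + h, g, node)
    best_cost = {start: 0}
    while frontier:
        _, cost, node = heapq.heappop(frontier)
        if node == goal:
            return cost
        if cost > best_cost[node]:
            continue  # stale queue entry, a cheaper route was found later
        x, y = node
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            inside = 0 <= nxt[0] < width and 0 <= nxt[1] < height
            if not inside or nxt in blocked:
                continue
            new_cost = cost + 1  # assumed unit cost per step
            if new_cost < best_cost.get(nxt, float("inf")):
                best_cost[nxt] = new_cost
                heapq.heappush(
                    frontier, (new_cost + heuristic(nxt, goal), new_cost, nxt)
                )
    return None

free_cost = astar((0, 0), (2, 0), 3, 3, blocked=set())
blocked_cost = astar((0, 0), (2, 0), 3, 3, blocked={(1, 0), (1, 1)})
```

In this toy grid the obstacle-free heuristic stays a lower bound after obstacles are added, so the returned cost is still optimal; the obstacles only make the path longer and the search visit more cells, which is the efficiency variation the abstract describes.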


Parallel Architecture and Hyperparameter Search via Successive Halving and Classification

arXiv.org Artificial Intelligence

We present a simple and powerful algorithm for parallel black box optimization called Successive Halving and Classification (SHAC). The algorithm operates in $K$ stages of parallel function evaluations and trains a cascade of binary classifiers to iteratively cull the undesirable regions of the search space. SHAC is easy to implement, requires no tuning of its own configuration parameters, is invariant to the scale of the objective function and can be built using any choice of binary classifier. We adopt tree-based classifiers within SHAC and achieve competitive performance against several strong baselines for optimizing synthetic functions, hyperparameters and architectures.


Correlation Heuristics for Constraint Programming

arXiv.org Artificial Intelligence

Backtracking search combined with constraint solving is the main approach to solving problems in Constraint Programming (CP). The key to effective search is having a good variable search heuristic to select a variable to branch on, as the size of the search tree is strongly dependent on the selected variables. In CP, many general-purpose variable ordering search heuristics have been proposed and implemented in many CP solvers, such as the conflict-driven heuristic dom/wdeg [1], the impact-based search (IBS) heuristic [2], and the activity-based search (ABS) heuristic [3]. Search heuristics by their nature are not designed to be optimal search strategies but merely good ones. Thus, our goal in this paper is a new search heuristic which can outperform existing heuristics on some instances across a range of problems. We propose a new idea, correlation-based search (CRBS), a search heuristic that employs correlations between variables.


Monte Carlo Tree Search for Asymmetric Trees

arXiv.org Artificial Intelligence

We present an extension of Monte Carlo Tree Search (MCTS) that strongly increases its efficiency for trees with asymmetry and/or loops. Asymmetric termination of search trees introduces a type of uncertainty for which the standard upper confidence bound (UCB) formula does not account. Our first algorithm (MCTS-T), which assumes a non-stochastic environment, backs up tree structure uncertainty and leverages it for exploration in a modified UCB formula. Results show vastly improved efficiency in a well-known asymmetric domain in which MCTS performs arbitrarily badly. Next, we connect the ideas about asymmetric termination to the presence of loops in the tree, where the same state appears multiple times in a single trace. An extension to our algorithm (MCTS-T+), which in addition to non-stochasticity assumes full state observability, further increases search efficiency for domains with loops as well. Benchmark testing on a set of OpenAI Gym and Atari 2600 games indicates that our algorithms always perform better than or at least equivalent to standard MCTS, and could be first-choice tree search algorithms for non-stochastic, fully-observable environments.
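For reference, the standard UCB rule that the abstract takes as its starting point can be sketched as follows. This is the common UCB1 form used in plain MCTS, not the modified formula of MCTS-T, whose details are not given here; the function names and the exploration constant `c` are assumptions.

```python
import math

def ucb1(mean_value, child_visits, parent_visits, c=math.sqrt(2)):
    """Standard UCB1 score: mean value plus a visit-count exploration bonus."""
    if child_visits == 0:
        return math.inf  # unvisited children are always tried first
    return mean_value + c * math.sqrt(math.log(parent_visits) / child_visits)

def select_child(children, parent_visits, c=math.sqrt(2)):
    """Return the index of the child with the highest UCB1 score.

    children is a list of (mean_value, visit_count) pairs.
    """
    scores = [ucb1(mean, visits, parent_visits, c) for mean, visits in children]
    return max(range(len(children)), key=scores.__getitem__)

# Equal visit counts: the exploration bonus is identical, so the mean decides.
greedy_pick = select_child([(0.9, 50), (0.1, 50)], parent_visits=100)
# An unvisited child outranks any visited one.
explore_pick = select_child([(0.5, 10), (0.4, 0)], parent_visits=10)
```

The bonus depends only on visit counts, so it carries no information about whether the subtree below a child has already been fully explored, which is the kind of tree-structure uncertainty the abstract says the standard formula misses.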


M-Walk: Learning to Walk in Graph with Monte Carlo Tree Search

arXiv.org Artificial Intelligence

Learning to walk over a graph towards a target node for a given input query and a source node is an important problem in applications such as knowledge base completion (KBC). It can be formulated as a reinforcement learning (RL) problem with a known state transition model. To overcome the challenge of sparse reward, we develop a graph-walking agent called M-Walk, which consists of a deep recurrent neural network (RNN) and Monte Carlo Tree Search (MCTS). The RNN encodes the state (i.e., history of the walked path) and maps it separately to a policy, a state value and state-action Q-values. In order to effectively train the agent from sparse reward, we combine MCTS with the neural policy to generate trajectories yielding more positive rewards. From these trajectories, the network is improved in an off-policy manner using Q-learning, which modifies the RNN policy via parameter sharing. Our proposed RL algorithm repeatedly applies this policy-improvement step to learn the entire model. At test time, MCTS is again combined with the neural policy to predict the target node. Experimental results on several graph-walking benchmarks show that M-Walk is able to learn better policies than other RL-based methods, which are mainly based on policy gradients. M-Walk also outperforms traditional KBC baselines.


Machine Learning in Compiler Optimisation – Arxiv Vanity

#artificialintelligence

EAs are useful for exploring a large optimisation space where it is infeasible to just enumerate all possible solutions. This is because an EA can often converge to the most promising area in the optimisation space quicker than a general search heuristic. The EA is also shown to be faster than a dynamic programming based search [24] in finding the optimal transformation for the Fast Fourier Transformation (FFT) [102]. When compared to supervised learning, EAs have the advantage of requiring little problem-specific knowledge, and hence can be applied to a broad range of problems. However, because an EA typically relies on empirical evidence (e.g.


Designing the Game to Play: Optimizing Payoff Structure in Security Games

arXiv.org Artificial Intelligence

Effective game-theoretic modeling of defender-attacker behavior is becoming increasingly important. In many domains, the defender functions not only as a player but also as the designer of the game's payoff structure. We study Stackelberg Security Games where the defender, in addition to allocating defensive resources to protect targets from the attacker, can strategically manipulate the attacker's payoff under budget constraints in weighted L^p-norm form regarding the amount of change. Focusing on problems with a weighted L^1-norm form constraint, we present (i) a mixed integer linear program-based algorithm with an approximation guarantee; (ii) a branch-and-bound based algorithm with improved efficiency achieved by effective pruning; (iii) a polynomial time approximation scheme for a special but practical class of problems. In addition, we show that problems under budget constraints in L^0-norm form and weighted L^\infty-norm form can be solved in polynomial time. We provide an extensive experimental evaluation of our proposed algorithms.


Robot solves Rubik's Cube in 0.38 seconds

#artificialintelligence

This story was originally posted by Digital Trends. Whether it's beating us at games like the board game Go or stealing our jobs, the killer combination of artificial intelligence and robots is owning us puny humans left and right. The latest example of a high-tech achievement that will make you feel on the verge of extinction? A robot that's capable of completing a Rubik's Cube puzzle in just 0.38 seconds flat -- which includes image capture and computation time, along with physically moving the cube. Not only is that significantly faster than the human world record of 4.59 seconds, but it's also a big improvement on the official robot world record of 0.637 seconds, as set in late 2016.