 Undirected Networks


POMCPOW: An online algorithm for POMDPs with continuous state, action, and observation spaces

arXiv.org Artificial Intelligence

Online solvers for partially observable Markov decision processes have been applied to problems with large discrete state spaces, but continuous state, action, and observation spaces remain a challenge. This paper begins by investigating double progressive widening (DPW) as a solution to this challenge. However, we prove that this modification alone is not sufficient because the belief representations in the search tree collapse to a single particle, causing the algorithm to converge to a policy that is suboptimal regardless of the computation time. The main contribution of the paper is to propose a new algorithm, POMCPOW, that incorporates DPW and weighted particle filtering to overcome this deficiency and attack continuous problems. Simulation results show that these modifications allow the algorithm to be successful where previous approaches fail.
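The two ingredients named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `should_widen` and `WeightedBelief`, and the default values of `k` and `alpha`, are assumptions chosen for illustration. The widening test lets a node add a new child only while its child count is small relative to a power of its visit count, and the belief keeps weighted state particles rather than a single unweighted one.

```python
import random


def should_widen(num_children: int, num_visits: int,
                 k: float = 4.0, alpha: float = 0.25) -> bool:
    """Progressive widening test (k and alpha are illustrative values).

    A new child may be added while num_children <= k * num_visits ** alpha,
    so the branching factor grows slowly as the node is visited more often.
    """
    return num_children <= k * (num_visits ** alpha)


class WeightedBelief:
    """Belief stored as state particles with non-negative weights."""

    def __init__(self):
        self.states = []
        self.weights = []

    def add(self, state, weight: float) -> None:
        # In a weighted particle filter the weight would typically be
        # proportional to the likelihood of the observation given the state.
        self.states.append(state)
        self.weights.append(weight)

    def sample(self, rng: random.Random):
        # Draw one state with probability proportional to its weight.
        return rng.choices(self.states, weights=self.weights, k=1)[0]
```

Applying the widening test to both the action and the observation branches of a search tree gives "double" progressive widening; the weighted belief is one way to keep several particles alive at each observation node.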


Diagnosing early-stage cervical cancer using artificial intelligence

#artificialintelligence

Using an artificial intelligence-based algorithm that analyzes scattered light data from tissues, researchers from IISER Kolkata and IIT Kanpur have been able to differentiate normal and precancerous tissue, and even identify the different stages of progression of the disease within a few minutes and with great accuracy. In vivo studies are now being carried out. The morphologies of healthy and precancerous cervical tissue sites are quite different, and light that gets scattered from these tissues varies accordingly. Yet it is difficult to discern with the naked eye the subtle differences in the scattered light characteristics of normal and precancerous tissue. Now, an artificial intelligence-based algorithm developed by a team of researchers from the Indian Institute of Science Education and Research (IISER) Kolkata and the Indian Institute of Technology (IIT) Kanpur makes this possible.


A Zero-Math Introduction to Markov Chain Monte Carlo Methods

@machinelearnbot

So, what are Markov chain Monte Carlo (MCMC) methods? In this article, I will explain that short answer, without any math. A parameter of interest is just some number that summarizes a phenomenon we're interested in. In general we use statistics to estimate parameters. For example, if we want to learn about the height of human adults, our parameter of interest might be average height in inches.


On Statistical Optimality of Variational Bayes

arXiv.org Machine Learning

Variational inference [25, 7, 40] is now a well-established tool to approximate intractable posterior distributions in hierarchical multi-layered Bayesian models. The traditional Markov chain Monte Carlo (MCMC; [17]) approach of approximating distributions with intractable normalizing constants draws (correlated) samples according to a discrete-time Markov chain whose stationary distribution is the target distribution. Despite their success and popularity, MCMC methods can be slow to converge and lack scalability in big data problems and/or problems involving very many latent variables, which has fueled the search for alternatives. In contrast to the sampling approach of MCMC, variational inference approaches the problem from an optimization viewpoint. First, a class of analytically tractable distributions, referred to as the variational family, is identified for the problem at hand. For example, in mean-field approximation, the set of parameters and latent variables is divided into blocks and the variational distribution is assumed to be independent across blocks.
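The sampling approach described above can be sketched with a random-walk Metropolis sampler, one of the simplest MCMC schemes. This is an illustrative sketch rather than anything taken from the paper: the function name `metropolis`, the Gaussian proposal, and the default step size are assumptions. Note that only an unnormalized log-density is needed, which is why the method applies when the normalizing constant is intractable.

```python
import math
import random


def metropolis(log_unnorm_density, x0, n_samples, step=1.0, seed=0):
    """Random-walk Metropolis sampler for a one-dimensional target.

    log_unnorm_density: log of the target density up to an additive constant.
    Returns a list of correlated samples forming a Markov chain whose
    stationary distribution is the target.
    """
    rng = random.Random(seed)
    x = x0
    log_p = log_unnorm_density(x)
    samples = []
    for _ in range(n_samples):
        # Symmetric Gaussian proposal centred on the current state.
        proposal = x + rng.gauss(0.0, step)
        log_p_new = log_unnorm_density(proposal)
        # Accept with probability min(1, p_new / p_old).
        if log_p_new >= log_p or rng.random() < math.exp(log_p_new - log_p):
            x, log_p = proposal, log_p_new
        samples.append(x)
    return samples


def log_target(x):
    # Unnormalized Gaussian with mean 3 and unit variance (example target).
    return -0.5 * (x - 3.0) ** 2
```

For example, `metropolis(log_target, x0=0.0, n_samples=20000)` returns a chain whose sample mean should lie close to 3. Successive samples are correlated, which is one reason convergence can be slow on large problems.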


Unsupervised Learning Course Web Page

@machinelearnbot

Aims: This course provides students with an in-depth introduction to statistical modelling and unsupervised learning techniques. It presents probabilistic approaches to modelling and their relation to coding theory and Bayesian statistics. A variety of latent variable models will be covered including mixture models (used for clustering), dimensionality reduction methods, time series models such as hidden Markov models which are used in speech recognition and bioinformatics, independent components analysis, hierarchical models, and nonlinear models. The course will present the foundations of probabilistic graphical models. We will cover Markov chain Monte Carlo sampling methods and variational approximations for inference. Time permitting, students will also learn about other topics in machine learning.


Non-convex Optimization for Machine Learning

arXiv.org Machine Learning

A vast majority of machine learning algorithms train their models and perform inference by solving optimization problems. In order to capture the learning and prediction problems accurately, structural constraints such as sparsity or low rank are frequently imposed, or else the objective itself is designed to be a non-convex function. This is especially true of algorithms that operate in high-dimensional spaces or that train non-linear models such as tensor models and deep networks. The freedom to express the learning problem as a non-convex optimization problem gives immense modeling power to the algorithm designer, but often such problems are NP-hard to solve. A popular workaround has been to relax non-convex problems to convex ones and use traditional methods to solve the (convex) relaxed optimization problems. However, this approach may be lossy and nevertheless presents significant challenges for large-scale optimization. On the other hand, direct approaches to non-convex optimization have met with resounding success in several domains and remain the methods of choice for the practitioner, as they frequently outperform relaxation-based techniques; popular heuristics include projected gradient descent and alternating minimization. However, these are often poorly understood in terms of their convergence and other properties. This monograph presents a selection of recent advances that bridge a long-standing gap in our understanding of these heuristics. The monograph will lead the reader through several widely used non-convex optimization techniques, as well as applications thereof. The goal of this monograph is both to introduce the rich literature in this area and to equip the reader with the tools and techniques needed to analyze these simple procedures for non-convex problems.
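As an illustration of projected gradient descent under a sparsity constraint, the following sketch minimizes a least-squares objective subject to at most `k` nonzero entries, projecting onto that non-convex set by hard thresholding after each gradient step. It is a minimal pure-Python sketch, not code from the monograph; the function names, the zero initialization, and the fixed iteration count are assumptions.

```python
def hard_threshold(x, k):
    """Project x onto the k-sparse vectors: keep the k largest-magnitude entries."""
    order = sorted(range(len(x)), key=lambda i: abs(x[i]), reverse=True)
    keep = set(order[:k])
    return [x[i] if i in keep else 0.0 for i in range(len(x))]


def objective(A, y, x):
    """Least-squares objective 0.5 * ||A x - y||^2 for A given as a list of rows."""
    residual = [sum(a * xi for a, xi in zip(row, x)) - yi
                for row, yi in zip(A, y)]
    return 0.5 * sum(r * r for r in residual)


def projected_gradient_descent(A, y, k, step, n_iters=200):
    """Gradient step on the least-squares loss, then projection onto k-sparse vectors."""
    d = len(A[0])
    x = [0.0] * d
    for _ in range(n_iters):
        residual = [sum(a * xi for a, xi in zip(row, x)) - yi
                    for row, yi in zip(A, y)]
        # Gradient of the objective is A^T (A x - y).
        grad = [sum(A[r][j] * residual[r] for r in range(len(A)))
                for j in range(d)]
        x = hard_threshold([xj - step * g for xj, g in zip(x, grad)], k)
    return x
```

With `A` equal to the 3x3 identity, `y = [3.0, 0.1, -2.0]`, `k = 2`, and `step = 1.0`, the iterate settles on the two largest-magnitude entries of `y`. For a general `A` the step size matters; a common choice is the reciprocal of the largest eigenvalue of `A^T A`.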


On Monte Carlo Tree Search and Reinforcement Learning

Journal of Artificial Intelligence Research

Fuelled by successes in Computer Go, Monte Carlo tree search (MCTS) has achieved widespread adoption within the games community. Its links to traditional reinforcement learning (RL) methods have been outlined in the past; however, the use of RL techniques within tree search has not been thoroughly studied yet. In this paper we re-examine in depth this close relation between the two fields; our goal is to improve the cross-awareness between the two communities. We show that a straightforward adaptation of RL semantics within tree search can lead to a wealth of new algorithms, for which the traditional MCTS is only one of the variants. We confirm that planning methods inspired by RL in conjunction with online search demonstrate encouraging results on several classic board games and in arcade video game competitions, where our algorithm recently ranked first. Our study promotes a unified view of learning, planning, and search.


Riemann-Theta Boltzmann Machine

arXiv.org Machine Learning

A general Boltzmann machine with continuous visible and discrete integer-valued hidden states is introduced. Under mild assumptions about the connection matrices, the probability density function of the visible units can be solved for analytically, yielding a novel parametric density function involving a ratio of Riemann-Theta functions. The conditional expectation of a hidden state for given visible states can also be calculated analytically, yielding a derivative of the logarithmic Riemann-Theta function. The conditional expectation can be used as an activation function in a feedforward neural network, thereby increasing the modelling capacity of the network. Both the Boltzmann machine and the derived feedforward neural network can be successfully trained via standard gradient- and non-gradient-based optimization techniques.


VAMPnets: Deep learning of molecular kinetics

arXiv.org Machine Learning

There is an increasing demand for computing the relevant structures, equilibria and long-timescale kinetics of biomolecular processes, such as protein-drug binding, from high-throughput molecular dynamics simulations. Current methods employ transformation of simulated coordinates into structural features, dimension reduction, clustering the dimension-reduced data, and estimation of a Markov state model or related model of the interconversion rates between molecular structures. This handcrafted approach demands a substantial amount of modeling expertise, as poor decisions at any step will lead to large modeling errors. Here we employ the variational approach for Markov processes (VAMP) to develop a deep learning framework for molecular kinetics using neural networks, dubbed VAMPnets. A VAMPnet encodes the entire mapping from molecular coordinates to Markov states, thus combining the whole data processing pipeline in a single end-to-end framework. Our method performs as well as or better than state-of-the-art Markov modeling methods and provides easily interpretable few-state kinetic models.


athenahealth: Data Scientists

@machinelearnbot

Join us to use cutting edge machine learning to unbreak healthcare in the US. In the US, physicians face huge informational challenges – from dealing with mountains of formulaic email to wrestling with arcane insurance rules to finding at-risk patients in their large client pools. Athenahealth's Data Science group is using advanced machine learning and AI to develop a new generation of smart tools that can help physicians by reducing their paperwork, finding at-risk patients, providing key information at the right time, and overall allowing physicians to focus on what's important: spending time with patients. We're seeking experienced data scientists who love machine learning and complex data and who care about making a positive impact on the world by fielding real ML-driven systems. Positions are available at multiple levels of seniority.