Autonomous cars use a variety of technologies like radar, lidar, odometry and computer vision to detect objects and people on the road, prompting it to adjust its trajectory accordingly. To tackle this problem, electrical engineers from University of California, San Diego used powerful machine learning techniques in a recent experiment that incorporated so-called deep learning algorithms in a pedestrian-detection system that performs in near real-time, using visual data only. The findings, which were presented at the International Conference on Computer Vision in Santiago, Chile, are an improvement over current methods of pedestrian detection, which uses something called cascade detection. This traditional form of classification architecture in computer vision takes a multi-stage approach that first breaks down an image into smaller image windows. These sub-images are then processed by whether they contain the presence of a pedestrian or not, using markers like shape and color.
We present a semi-supervised learning algorithm for learning discrete factor analysis models with arbitrary structure on the latent variables. Our algorithm assumes that every latent variable has an "anchor", an observed variable with only that latent variable as its parent. Given such anchors, we show that it is possible to consistently recover moments of the latent variables and use these moments to learn complete models. We also introduce a new technique for improving the robustness of method-of-moment algorithms by optimizing over the marginal polytope or its relaxations. We evaluate our algorithm using two real-world tasks, tag prediction on questions from the Stack Overflow website and medical diagnosis in an emergency department.
Deep dynamic generative models are developed to learn sequential dependencies in time-series data. The multi-layered model is designed by constructing a hierarchy of temporal sigmoid belief networks (TSBNs), defined as a sequential stack of sigmoid belief networks (SBNs). Each SBN has a contextual hidden state, inherited from the previous SBNs in the sequence, and is used to regulate its hidden bias. Scalable learning and inference algorithms are derived by introducing a recognition model that yields fast sampling from the variational posterior. This recognition model is trained jointly with the generative model, by maximizing its variational lower bound on the log-likelihood. Experimental results on bouncing balls, polyphonic music, motion capture, and text streams show that the proposed approach achieves state-of-the-art predictive performance, and has the capacity to synthesize various sequences.
To construct flexible nonlinear predictive distributions, the paper introduces a family of softplus function based regression models that convolve, stack, or combine both operations by convolving countably infinite stacked gamma distributions, whose scales depend on the covariates. Generalizing logistic regression that uses a single hyperplane to partition the covariate space into two halves, softplus regressions employ multiple hyperplanes to construct a confined space, related to a single convex polytope defined by the intersection of multiple half-spaces or a union of multiple convex polytopes, to separate one class from the other. The gamma process is introduced to support the convolution of countably infinite (stacked) covariate-dependent gamma distributions. For Bayesian inference, Gibbs sampling derived via novel data augmentation and marginalization techniques is used to deconvolve and/or demix the highly complex nonlinear predictive distribution. Example results demonstrate that softplus regressions provide flexible nonlinear decision boundaries, achieving classification accuracies comparable to that of kernel support vector machine while requiring significant less computation for out-of-sample prediction.
Monte Carlo tree search (MCTS) is extremely popular in computer Go which determines each action by enormous simulations in a broad and deep search tree. However, human experts select most actions by pattern analysis and careful evaluation rather than brute search of millions of future interactions. In this paper, we propose a computer Go system that follows experts’ way of thinking and playing. Our system consists of two parts. The first part is a novel deep alternative neural network (DANN) used to generate candidates of next move. Compared with existing deep convolutional neural network (DCNN), DANN inserts recurrent layer after each convolutional layer and stacks them in an alternative manner. We show such setting can preserve more contexts of local features and its evolutions which are beneficial for move prediction. The second part is a long-term evaluation (LTE) module used to provide a reliable evaluation of candidates rather than a single probability from move predictor. This is consistent with human experts’ nature of playing since they can foresee tens of steps to give an accurate estimation of candidates. In our system, for each candidate, LTE calculates a cumulative reward after several future interactions when local variations are settled. Combining criteria from the two parts, our system determines the optimal choice of next move. For more comprehensive experiments, we introduce a new professional Go dataset (PGD), consisting of $253,233$ professional records. Experiments on GoGoD and PGD datasets show the DANN can substantially improve performance of move prediction over pure DCNN. When combining LTE, our system outperforms most relevant approaches and open engines based on MCTS.