AITopics

Country: North America > United States > Michigan > Washtenaw County > Ann Arbor (0.14)

Genre: Research Report (0.47)

Industry: Energy (0.69)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.35)

Norouzi, Mohammad, Fleet, David J., Salakhutdinov, Ruslan R.

Hamming Distance Metric Learning

Motivated by large-scale multimedia applications we propose to learn mappings from high-dimensional data to binary codes that preserve semantic similarity. Binary codes are well suited to large-scale applications as they are storage efficient and permit exact sub-linear kNN search. The framework is applicable to broad families of mappings, and uses a flexible form of triplet ranking loss. We overcome discontinuous optimization of the discrete mappings by minimizing a piecewise-smooth upper bound on empirical loss, inspired by latent structural SVMs. We develop a new loss-augmented inference algorithm that is quadratic in the code length. We show strong retrieval performance on CIFAR-10 and MNIST, with promising classification results using no more than kNN on the binary codes.

binary code, deep learning, neural network, (19 more...)

Country:

North America > Canada > Ontario > Toronto (0.14)
Asia (0.14)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Wilson, Aaron, Fern, Alan, Tadepalli, Prasad

A Bayesian Approach for Policy Learning from Trajectory Preference Queries

We consider the problem of learning control policies via trajectory preference queries to an expert. In particular, the learning agent can present an expert with short runs of a pair of policies originating from the same state and the expert then indicates the preferred trajectory. The agent's goal is to elicit a latent target policy from the expert with as few queries as possible. To tackle this problem we propose a novel Bayesian model of the querying process and introduce two methods that exploit this model to actively select expert queries. Experimental results on four benchmark problems indicate that our model can effectively learn policies from trajectory preference queries and that active query selection can be substantially more efficient than random selection.

Collins, Michael, Cohen, Shay B.

Tensor Decomposition for Fast Parsing with Latent-Variable PCFGs

We describe an approach to speed-up inference with latent variable PCFGs, which have been shown to be highly effective for natural language parsing. Our approach is based on a tensor formulation recently introduced for spectral estimation of latent-variable PCFGs coupled with a tensor decomposition algorithm well-known in the multilinear algebra literature. We also describe an error bound for this approximation, which bounds the difference between the probabilities calculated by the algorithm and the true probabilities that the approximated model gives. Empirical evaluation on real-world natural language parsing data demonstrates a significant speed-up at minimal cost for parsing performance.

algorithm, artificial intelligence, natural language, (17 more...)

Country: North America > United States (0.14)

Technology: Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)

Gopal, Siddharth, Yang, Yiming, Bai, Bing, Niculescu-mizil, Alexandru

Bayesian models for Large-scale Hierarchical Classification

A challenging problem in hierarchical classification is to leverage the hierarchical relations among classes for improving classification performance. An even greater challenge is to do so in a manner that is computationally feasible for the large scale problems usually encountered in practice. This paper proposes a set of Bayesian methods to model hierarchical dependencies among class labels using multivari- ate logistic regression. Specifically, the parent-child relationships are modeled by placing a hierarchical prior over the children nodes centered around the parame- ters of their parents; thereby encouraging classes nearby in the hierarchy to share similar model parameters. We present new, efficient variational algorithms for tractable posterior inference in these models, and provide a parallel implementa- tion that can comfortably handle large-scale problems with hundreds of thousands of dimensions and tens of thousands of classes. We run a comparative evaluation on multiple large-scale benchmark datasets that highlights the scalability of our approach, and shows a significant performance advantage over the other state-of- the-art hierarchical methods.

artificial intelligence, bayesian inference, hierarchy, (17 more...)

Country: North America > United States (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Krizhevsky, Alex, Sutskever, Ilya, Hinton, Geoffrey E.

ImageNet Classification with Deep Convolutional Neural Networks

We trained a large, deep convolutional neural network to classify the 1.2 million high-resolution images in the ImageNet LSVRC-2010 contest into the 1000 different classes.On the test data, we achieved top-1 and top-5 error rates of 37.5% and 17.0% which is considerably better than the previous state-of-the-art. The neural network, which has 60 million parameters and 650,000 neurons, consists of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax. To make training faster,we used non-saturating neurons and a very efficient GPU implementation ofthe convolution operation. To reduce overfitting in the fully-connected layers we employed a recently-developed regularization method called "dropout" that proved to be very effective. We also entered a variant of this model in the ILSVRC-2012 competition and achieved a winning top-5 test error rate of 15.3%, compared to 26.2% achieved by the second-best entry.

convolutional layer, deep learning, neural network, (18 more...)

Country:

North America > Canada > Ontario > Toronto (0.15)
Europe > Italy (0.14)

Genre: Research Report (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Huang, Jonathan, Alexander, Daniel

Probabilistic Event Cascades for Alzheimer's disease

Accurate and detailed models of the progression of neurodegenerative diseases such as Alzheimer's (AD) are crucially important for reliable early diagnosis and the determination and deployment of effective treatments. In this paper, we introduce the ALPACA (Alzheimer's disease Probabilistic Cascades) model, a generative model linking latent Alzheimer's progression dynamics to observable biomarker data. In contrast with previous works which model disease progression as a fixed ordering of events, we explicitly model the variability over such orderings among patients which is more realistic, particularly for highly detailed disease progression models. We describe efficient learning algorithms for ALPACA and discuss promising experimental results on a real cohort of Alzheimer's patients from the Alzheimer's Disease Neuroimaging Initiative.

bayesian inference, neurology, snapshot, (22 more...)

Country:

North America > United States > Oregon (0.14)
Asia > Middle East > Israel (0.14)

Genre: Research Report (0.93)

Industry: Health & Medicine > Therapeutic Area > Neurology > Alzheimer's Disease (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.69)

Bayesian Hierarchical Reinforcement Learning

Cao, Feng, Ray, Soumya

We describe an approach to incorporating Bayesian priors in the maxq framework for hierarchical reinforcement learning (HRL). We define priors on the primitive environment model and on task pseudo-rewards. Since models for composite tasks can be complex, we use a mixed model-based/model-free learning approach to find an optimal hierarchical policy. We show empirically that (i) our approach results in improved convergence over non-Bayesian baselines, given sensible priors, (ii) task hierarchies and Bayesian priors can be complementary sources of information, and using both sources is better than either alone, (iii) taking advantage of the structural decomposition induced by the task hierarchy significantly reduces the computational cost of Bayesian reinforcement learning and (iv) in this framework, task pseudo-rewards can be learned instead of being manually specified, leading to automatic learning of hierarchically optimal rather than recursively optimal policies.

Guzmán-rivera, Abner, Batra, Dhruv, Kohli, Pushmeet

Multiple Choice Learning: Learning to Produce Multiple Structured Outputs

The paper addresses the problem of generating multiple hypotheses for prediction tasks that involve interaction with users or successive components in a cascade. Given a set of multiple hypotheses, such components/users have the ability to automatically rank the results and thus retrieve the best one. The standard approach for handling this scenario is to learn a single model and then produce M-best Maximum a Posteriori (MAP) hypotheses from this model. In contrast, we formulate this multiple {\em choice} learning task as a multiple-output structured-output prediction problem with a loss function that captures the natural setup of the problem. We present a max-margin formulation that minimizes an upper-bound on this loss-function. Experimental results on the problems of image co-segmentation and protein side-chain prediction show that our method outperforms conventional approaches used for this scenario and leads to substantial improvements in prediction accuracy.

artificial intelligence, optimization problem, prediction, (18 more...)

Country: North America > United States (0.14)

Industry: Education (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.88)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)

Socher, Richard, Huval, Brody, Bath, Bharath, Manning, Christopher D., Ng, Andrew Y.

Convolutional-Recursive Deep Learning for 3D Object Classification

Recent advances in 3D sensing technologies make it possible to easily record color and depth images which together can improve object recognition. Most current methods rely on very well-designed features for this new 3D modality. We introduce amodel based on a combination of convolutional and recursive neural networks (CNN and RNN) for learning features and classifying RGB-D images. The CNN layer learns low-level translationally invariant features which are then given as inputs to multiple, fixed-tree RNNs in order to compose higher order features. RNNscan be seen as combining convolution and pooling into one efficient, hierarchical operation. Our main result is that even RNNs with random weights compose powerful features. Our model obtains state of the art performance on a standard RGB-D object dataset while being more accurate and faster during training andtesting than comparable architectures such as two-layer CNNs.

deep learning, neural network, rnn, (17 more...)